Silk Proteins

ABSTRACT

The present invention provides silk proteins, as well as nucleic acids encoding these proteins. The present invention also provides recombinant cells and/or organisms which synthesize silk proteins. Silk proteins of the invention can be used for a variety of purposes such as in the manufacture of personal care products, plastics, textiles, and biomedical products.

FIELD OF THE INVENTION

The present invention relates to silk proteins, as well as nucleic acids encoding such proteins. The present invention also relates to recombinant cells and/or organisms which synthesize silk proteins. Silk proteins of the invention can be used for a variety of purposes such as in the production of personal care products, plastics, textiles, and biomedical products.

BACKGROUND OF THE INVENTION

Silks are fibrous protein secretions that exhibit exceptional strength and toughness and as such have been the target of extensive study. Silks are produced by over 30,000 species of spiders and by many insects. Very few of these silks have been characterised, with most research concentrating on the cocoon silk of the domesticated silkworm, Bombyx mori, and on the dragline silk of the orb-weaving spider Nephila clavipes.

In the Lepidoptera and spider, the fibroin silk genes code for proteins that are generally large with prominent hydrophilic terminal domains at either end spanning an extensive region of alternating hydrophobic and hydrophilic blocks (Bini et al., 2004). Generally these proteins comprise different combinations of crystalline arrays of β-pleated sheets loosely associated with β-sheets, β-spirals, α-helices and amorphous regions (see Craig and Riekel, 2002 for review).

As silk fibres represent some of the strongest natural fibres known, they have been subject to extensive research in attempts to reproduce their synthesis. However, a recurrent problem with expression of Lepidopteran and spider fibroin genes has been low expression rates in various recombinant expression systems due to the combination of the repeating nucleotide motifs in the silk gene that lead to deleterious recombination events, the large gene size and the small number of codons used for each amino acid in the gene which leads to depletion of tRNA pools in the host cells. Recombinant expression leads to difficulties during translation such as translational pauses as a result of codon preferences and codon demands and extensive recombination rates leading to truncation of the genes. Shorter, less repetitive sequences would avoid many of the problems associated with silk gene expression to date.

In contrast to the extensive knowledge that has accumulated about the Lepidopteran (in particular the cocoon silk of Bombyx mori) and spider (in particular the dragline silk of Nephila clavipes) silks, little is known about the chemical composition and molecular organisation of other insect silks.

In the early 1960s, the silk of the aculeate Hymenopteran was shown to have an alpha-helical structure by X-ray diffraction patterns obtained from silk fibres drawn from the salivary gland of honeybee larvae (Rudall, 1962). As well as demonstrating that this silk was helical, the patterns obtained were indicative of a coiled-coil system of alpha-helical chains (Atkins, 1967). Similar X-ray diffraction patterns have been obtained for cocoon silks from other Aculeata species including the wasp Pseudopompilus humbolti (Rudall, 1962) and the bumblebee, Bombus lucorum (Lucas and Rudall, 1967).

In contrast to the alpha-helical structure described in the Aculeata silks, the silks characterised from a related clade to the Aculeata, the Ichneumonoidea, have parallel-β structures. X-ray diagrams for four examples of this structure have been described in the Braconidae (Cotesia (=Apanteles) glomerata; Cotesia (=Apanteles) gonopterygis; Apanteles bignelli) and three in Ichneumonidae (Dusona sp.; Phytodietris sp.; Branchus femoralis) (Lucas and Rudall, 1967). In addition, the sequence of a single Braconidae (Cotesia glomerata) silk has been described (Genbank database accession number AB188680; Yamada et al., 2004). This partial protein sequence consists of a highly conserved 28 X-asparagine repeat (where X is alanine or serine) and is not predicted to contain coiled coil forming heptad repeats. Extensive analysis of the amino acid composition of the cocoon silks of the Braconidae has shown that the silks from the subfamily Microgastrinae are unique in their high asparagine and serine content (Lucas et al., 1960; Quicke et al., 2004). Related subfamilies produce silks with significantly different amino acid compositions, suggesting that the Microgastrinae silks have evolved specifically in this subfamily (Yamada et al., 2004). The partial cDNA of Cotesia glomerata was isolated using PCR primers designed from sequence obtained from internal peptides derived from isolated cocoon silk proteins. The predicted amino acid composition of this partial sequence closely resembles the amino acid composition of the extensively washed silk from this species.

The structure of many of the silks within other non-aculeate Apocrita and within the rest of the Hymenoptera (Symphyta) is most commonly parallel-β sheets, with both collagen-like and polyglycine silks produced by the Tenthredinidae (Lucas and Rudall, 1967).

Honeybee silk proteins are synthesised in the middle of the final instar and can be imaged as a mix of depolymerised silk proteins (Silva-Zacarin et al., 2003). As the instar progresses, water is removed from the gland and dehydration results in the polymerisation of the silk protein to form well-organised and insoluble silk filaments labelled tactoids (Silva-Zacarin et al., 2003). Progressive dehydration leads to further reorganisation of the tactoids (Silva-Zacarin et al., 2003) and possibly new inter-filamentary bonding between filaments (Rudall, 1962). Electron microscope images of fibrils isolated from the honeybee silk gland show structures of approximately 20-25 angstroms diameter (Flower and Kenchington, 1967). This value is consistent with three-, four-, or five-stranded coiled coils.

The amino acid composition of the silks of various aculeate Hymenopteran species was determined by Lucas and Rudall (1967) and found to contain high contents of alanine, serine, the acidic residues aspartic acid and glutamic acid, and reduced amounts of glycine in comparison to classical fibroins. It was considered that the helical content of the aculeate Hymenoptera silk was a consequence of a reduced glycine content and increased content of acidic residues (Rudall and Kenchington, 1971).

Little is known about the larval silk of the lacewings (Order: Neuroptera). The cocoon is comprised of two layers, an inner solid layer and an outer fibrous layer. Previously the cocoon was described as being comprised of a cuticulin silk (Rudall and Kenchington, 1971), a description that only related to the inner solid layer. LaMunyon (1988) described a substance excreted from the Malpighian tubules that made up the outer fibres. After deposition of this layer, the solid inner wall was constructed from secretions from the epithelial cells in the highly villous lumen (LaMunyon, 1988).

It is also known that lacewing larvae produce a proteinaceous adhesive substance from the Malpighian tubules throughout all instars to stick the larvae to substrates, to glue items of camouflage on to the larvae's back or to entrap prey (Speilger, 1962). In the genus Lomamyia (Berothidae), the larvae produce the silk and adhesive substance at the same time and it has been postulated that these two substances may well be the same product (Speilger, 1962). The adhesive secretion is highly soluble and is also thought to be associated with defense against predators (LaMunyon & Adams, 1987).

Considering the unique properties of silks produced by insects such as Hymenopterans and Neuropterans, there is a need for the identification of novel nucleic acids encoding silk proteins from these organisms.

SUMMARY OF THE INVENTION

The present inventors have identified numerous silk proteins from insects. These silk proteins are surprisingly different to other known silk proteins in their primary sequence, secondary structure and/or amino acid content.

Thus, in a first aspect the present invention provides a substantially purified and/or recombinant silk polypeptide, wherein at least a portion of the polypeptide has a coiled coil structure.

As known in the art, coiled coil structures of polypeptides are characterized by heptad repeats represented by the consensus sequence (abcdefg)_n, with generally hydrophobic residues in positions a and d, and generally polar residues at the remaining positions. Surprisingly, the heptads of the polypeptides of the present invention have a novel composition when viewed collectively, with an unusually high abundance of alanine in the 'hydrophobic' heptad positions a and d. Additionally, there are high levels of small polar residues in these positions. Furthermore, the e position also has high levels of alanine and small hydrophobic residues.

Accordingly, in a particularly preferred embodiment, the portion of the polypeptide that has a coiled coil structure comprises at least 10 copies of the heptad sequence abcdefg, and at least 25% of the amino acids at positions a and d are alanine residues.

In a further preferred embodiment, the portion of the polypeptide that has a coiled coil structure comprises at least 10 copies of the heptad sequence abcdefg, and at least 25% of the amino acids at positions a, d and e are alanine residues.

In a further preferred embodiment, the portion of the polypeptide that has a coiled coil structure comprises at least 10 copies of the heptad sequence abcdefg, and at least 25% of the amino acids at position a are alanine residues.

In a further preferred embodiment, the portion of the polypeptide that has a coiled coil structure comprises at least 10 copies of the heptad sequence abcdefg, and at least 25% of the amino acids at position d are alanine residues.

In a further preferred embodiment, the portion of the polypeptide that has a coiled coil structure comprises at least 10 copies of the heptad sequence abcdefg, and at least 25% of the amino acids at position e are alanine residues.

In a particularly preferred embodiment, the at least 10 copies of the heptad sequence are contiguous.

In a further preferred embodiment, the portion of the polypeptide that has a coiled coil structure comprises at least 5 copies of the heptad sequence abcdefg, and at least 15% of the amino acids at positions a and d are alanine residues.

In a further preferred embodiment, the portion of the polypeptide that has a coiled coil structure comprises at least 5 copies of the heptad sequence abcdefg, and at least 15% of the amino acids at positions a, d and e are alanine residues.

In a further preferred embodiment, the portion of the polypeptide that has a coiled coil structure comprises at least 5 copies of the heptad sequence abcdefg, and at least 15% of the amino acids at position a are alanine residues.

In a further preferred embodiment, the portion of the polypeptide that has a coiled coil structure comprises at least 5 copies of the heptad sequence abcdefg, and at least 15% of the amino acids at position d are alanine residues.

In a further preferred embodiment, the portion of the polypeptide that has a coiled coil structure comprises at least 5 copies of the heptad sequence abcdefg, and at least 15% of the amino acids at position e are alanine residues.

In a particularly preferred embodiment, the at least 5 copies of the heptad sequence are contiguous.
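By way of illustration only, the heptad criteria set out in the preceding embodiments can be checked mechanically once a heptad register has been assigned to a sequence. The following Python sketch is not part of the described method: the function names, the toy sequence and the register string are hypothetical, and the register is assumed to have been obtained beforehand from a coiled coil prediction program. It computes the fraction of alanine residues at chosen heptad positions and the longest run of contiguous abcdefg heptads.

```python
HEPTAD = "abcdefg"


def alanine_fraction(sequence, register, positions=("a", "d")):
    """Fraction of residues at the given heptad positions that are alanine.

    sequence: one-letter amino acid string.
    register: string of equal length giving the heptad position (a-g)
              assigned to each residue (assumed to be supplied by the user).
    """
    if len(sequence) != len(register):
        raise ValueError("sequence and register must have the same length")
    selected = [res for res, pos in zip(sequence, register) if pos in positions]
    if not selected:
        return 0.0
    return sum(1 for res in selected if res == "A") / len(selected)


def longest_contiguous_heptads(register):
    """Length, in heptads, of the longest unbroken run of 'abcdefg' repeats."""
    best = run = 0
    i = 0
    while i + len(HEPTAD) <= len(register):
        if register[i:i + len(HEPTAD)] == HEPTAD:
            run += 1
            best = max(best, run)
            i += len(HEPTAD)
        else:
            run = 0
            i += 1
    return best


def meets_criterion(sequence, register, min_heptads=5, min_fraction=0.15,
                    positions=("a", "d")):
    """True if the region has enough contiguous heptads and enough alanine."""
    return (longest_contiguous_heptads(register) >= min_heptads
            and alanine_fraction(sequence, register, positions) >= min_fraction)


# Hypothetical toy region of five heptads (not a sequence of the invention).
toy_sequence = "ALKAELE" + "SLKAELE" + "ALKSELE" + "ALKAALE" + "ILKLELE"
toy_register = HEPTAD * 5

print(alanine_fraction(toy_sequence, toy_register, ("a", "d")))
print(longest_contiguous_heptads(toy_register))
print(meets_criterion(toy_sequence, toy_register))
```

The thresholds are passed as parameters so that the same check covers both the "at least 10 copies, at least 25%" and the "at least 5 copies, at least 15%" embodiments, for any of positions a, d and e. The sketch counts contiguous heptads only, which corresponds to the particularly preferred embodiments.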

In one embodiment, the polypeptide comprises a sequence selected from:

i) an amino acid sequence as provided in any one of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:56, and SEQ ID NO:57;

ii) an amino acid sequence which is at least 30% identical to any one or more of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:56, and SEQ ID NO:57; and

iii) a biologically active fragment of i) or ii).

In another embodiment, the polypeptide comprises a sequence selected from:

i) an amino acid sequence as provided in any one of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:58, and SEQ ID NO:59;

ii) an amino acid sequence which is at least 30% identical to any one or more of SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:58, and SEQ ID NO:59; and

iii) a biologically active fragment of i) or ii).

In another embodiment, the polypeptide comprises a sequence selected from:

i) an amino acid sequence as provided in any one of SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:60, and SEQ ID NO:61;

ii) an amino acid sequence which is at least 30% identical to any one or more of SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:60, and SEQ ID NO:61; and

iii) a biologically active fragment of i) or ii).

In another embodiment, the polypeptide comprises a sequence selected from:

i) an amino acid sequence as provided in any one of SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:62, and SEQ ID NO:63;

ii) an amino acid sequence which is at least 30% identical to any one or more of SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:62, and SEQ ID NO:63; and

iii) a biologically active fragment of i) or ii).

In a further embodiment, the polypeptide comprises a sequence selected from:

i) an amino acid sequence as provided in SEQ ID NO:72 or SEQ ID NO:73;

ii) an amino acid sequence which is at least 30% identical to SEQ ID NO:72 and/or SEQ ID NO:73; and

iii) a biologically active fragment of i) or ii).
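Purely as an illustration of what a percentage identity threshold such as "at least 30% identical" involves, the following Python sketch computes percent identity between two sequences that have already been aligned. It is an assumption-laden example rather than the method used to define identity in this specification: the function name, the gap character and the choice of denominator (all alignment columns in which at least one sequence has a residue) are hypothetical, and other conventions give different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are ignored; a column counts
    as a match only when both residues are identical and neither is a gap.
    The denominator convention used here is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment (not sequences of the invention).
identity = percent_identity("ACDE-FG", "ACQEAFG")
print(round(identity, 1))
print(identity >= 30.0)
```

In practice the alignment itself would be produced by a dedicated alignment program, and the reported identity depends on that program's scoring parameters.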

Further silk proteins which co-associate with proteins of the first aspect have been identified. One of these proteins (SEQ ID NO:10) is predicted to have 41% alpha-helical, 8% beta-sheet and 50% loop secondary structure by PROFsec, and therefore is classified as a mixed structure protein. MARCOIL analysis of this protein predicted only a short region of heptad repeats characteristic of proteins with a coiled coil structure.

Accordingly, in a second aspect, the present invention provides a substantially purified and/or recombinant silk polypeptide which comprises a sequence selected from:

i) an amino acid sequence as provided in any one of SEQ ID NO:9, SEQ ID NO:10 and SEQ ID NO:30;

ii) an amino acid sequence which is at least 30% identical to any one or more of SEQ ID NO:9, SEQ ID NO:10 and SEQ ID NO:30; and

iii) a biologically active fragment of i) or ii).

Without wishing to be limited by theory, it appears that four proteins of the first aspect become intertwined to form a bundle with helical axes almost parallel to each other, and this bundle extends axially into a fibril. Furthermore, it is predicted that in at least some species such as the honeybee and bumblebee the proteins of the second aspect act as a "glue" assisting in binding various bundles of coiled coil proteins of the first aspect together to form a fibrous protein complex. However, silk fibers and copolymers can still be formed without a polypeptide of the second aspect.

In a preferred embodiment, a polypeptide of the invention can be purified from, or is a mutant of a polypeptide purified from, a species of Hymenoptera or Neuroptera. Preferably, the species of Hymenoptera is Apis mellifera, Oecophylla smaragdina, Myrmecia forficata or Bombus terrestris. Preferably, the species of Neuroptera is Mallada signata.

In another aspect, the present invention provides a polypeptide of the invention fused to at least one other polypeptide.

In a preferred embodiment, the at least one other polypeptide is selected from the group consisting of: a polypeptide that enhances the stability of a polypeptide of the present invention, a polypeptide that assists in the purification of the fusion protein, and a polypeptide which assists in the polypeptide of the invention being secreted from a cell (for example secreted from a plant cell).

In another aspect, the present invention provides an isolated and/or exogenous polynucleotide which encodes a silk polypeptide, wherein at least a portion of the polypeptide has a coiled coil structure.

In one embodiment, the polynucleotide comprises a sequence selected from:

i) a sequence of nucleotides as provided in any one of SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:64, and SEQ ID NO:65;

ii) a sequence of nucleotides encoding a polypeptide of the invention;

iii) a sequence of nucleotides which is at least 30% identical to any one or more of SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:64, and SEQ ID NO:65; and

iv) a sequence which hybridizes to any one of i) to iii) under stringent conditions.

In another embodiment, the polynucleotide comprises a sequence selected from:

i) a sequence of nucleotides as provided in any one of SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:66, and SEQ ID NO:67;

ii) a sequence of nucleotides encoding a polypeptide of the invention;

iii) a sequence of nucleotides which is at least 30% identical to any one or more of SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:66, and SEQ ID NO:67; and

iv) a sequence which hybridizes to any one of i) to iii) under stringent conditions.

In another embodiment, the polynucleotide comprises a sequence selected from:

i) a sequence of nucleotides as provided in any one of SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:68, and SEQ ID NO:69;

ii) a sequence of nucleotides encoding a polypeptide of the invention;

iii) a sequence of nucleotides which is at least 30% identical to any one or more of SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:68, and SEQ ID NO:69; and

iv) a sequence which hybridizes to any one of i) to iii) under stringent conditions.

In a further embodiment, the polynucleotide comprises a sequence selected from:

i) a sequence of nucleotides as provided in any one of SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:70, SEQ ID NO:71 and SEQ ID NO:76;

ii) a sequence of nucleotides encoding a polypeptide of the invention;

iii) a sequence of nucleotides which is at least 30% identical to any one or more of SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:70, SEQ ID NO:71 and SEQ ID NO:76; and

iv) a sequence which hybridizes to any one of i) to iii) under stringent conditions.

In another embodiment, the polynucleotide comprises a sequence selected from:

i) a sequence of nucleotides as provided in SEQ ID NO:74 or SEQ ID NO:75;

ii) a sequence of nucleotides encoding a polypeptide of the invention;

iii) a sequence of nucleotides which is at least 30% identical to SEQ ID NO:74 and/or SEQ ID NO:75; and

iv) a sequence which hybridizes to any one of i) to iii) under stringent conditions.

In a further aspect, the present invention provides an isolated and/or exogenous polynucleotide, the polynucleotide comprising a sequence selected from:

i) a sequence of nucleotides as provided in any one of SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, and SEQ ID NO:39;

ii) a sequence of nucleotides encoding a polypeptide of the invention;

iii) a sequence of nucleotides which is at least 30% identical to any one or more of SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, and SEQ ID NO:39; and

iv) a sequence which hybridizes to any one of i) to iii) under stringent conditions.

In a preferred embodiment, a polynucleotide can be isolated from, or is a mutant of a polynucleotide isolated from, a species of Hymenoptera or Neuroptera. Preferably, the species of Hymenoptera is Apis mellifera, Oecophylla smaragdina, Myrmecia forficata or Bombus terrestris. Preferably, the species of Neuroptera is Mallada signata.

In a further aspect, the present invention provides a vector comprising at least one polynucleotide of the invention.

Preferably, the vector is an expression vector.

In another aspect, the present invention provides a host cell comprising at least one polynucleotide of the invention, and/or at least one vector of the invention.

The host cell can be any type of cell. Examples include, but are not limited to, a bacterial, yeast or plant cell.

Also provided is a process for preparing a polypeptide according to the invention, the process comprising cultivating a host cell of the invention, or a vector of the invention, under conditions which allow expression of the polynucleotide encoding the polypeptide, and recovering the expressed polypeptide.

It is envisaged that transgenic plants will be particularly useful for the production of polypeptides of the invention. Thus, in yet another aspect, the present invention provides a transgenic plant comprising an exogenous polynucleotide, the polynucleotide encoding at least one polypeptide of the invention.

In another aspect, the present invention provides a transgenic non-human animal comprising an exogenous polynucleotide, the polynucleotide encoding at least one polypeptide of the invention.

In yet another aspect, the present invention provides an antibody which specifically binds a polypeptide of the invention.

In a further aspect, the present invention provides a silk fiber comprising at least one polypeptide of the invention.

Preferably, the polypeptide is a recombinant polypeptide.

In an embodiment, at least some of the polypeptides are crosslinked. In an embodiment, at least some of the lysine residues of the polypeptides are crosslinked.

In another aspect, the present invention provides a copolymer comprising at least two polypeptides of the invention.

Preferably, the polypeptides are recombinant polypeptides.

In an embodiment, the copolymer comprises at least four different polypeptides of the first aspect. In another embodiment, the copolymer further comprises a polypeptide of the second aspect.

In an embodiment, at least some of the polypeptides are crosslinked. In an embodiment, at least some of the lysine residues of the polypeptides are crosslinked.

As the skilled addressee will appreciate, the polypeptides of the invention have a wide variety of uses as is known in the art for other types of silk proteins. Thus, in a further aspect, the present invention provides a product comprising at least one polypeptide of the invention, a silk fiber of the invention and/or a copolymer of the invention.

Examples of products include, but are not limited to, personal care products, textiles, plastics, and biomedical products.

In yet a further aspect, the present invention provides a composition comprising at least one polypeptide of the invention, a silk fiber of the invention and/or a copolymer of the invention, and one or more acceptable carriers.

In one embodiment, the composition further comprises a drug.

In another embodiment, the composition is used as a medicine, in a medical device or a cosmetic.

In another aspect, the present invention provides a composition comprising at least one polynucleotide of the invention, and one or more acceptable carriers.

In a preferred embodiment, a composition, silk fiber, copolymer and/or product of the invention does not comprise a royal jelly protein produced by an insect.

In a further aspect, the present invention provides a method of treating or preventing a disease, the method comprising administering a composition comprising a drug for treating or preventing the disease and a pharmaceutically acceptable carrier, wherein the pharmaceutically acceptable carrier is selected from at least one polypeptide of the invention, a silk fiber of the invention and/or a copolymer of the invention.

In yet another aspect, the present invention provides for the use of at least one polypeptide of the invention, a silk fiber of the invention and/or a copolymer of the invention, and a drug, for the manufacture of a medicament for treating or preventing a disease.

In a further aspect, the present invention provides a kit comprising at least one polypeptide of the invention, at least one polynucleotide of the invention, at least one vector of the invention, at least one silk fiber of the invention and/or a copolymer of the invention.

Preferably, the kit further comprises information and/or instructions for use of the kit.

As will be apparent, preferred features and characteristics of one aspect of the invention are applicable to many other aspects of the invention.

Throughout this specification the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.

The invention is hereinafter described by way of the following non-limiting Examples and with reference to the accompanying figures.

BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS

FIG. 1. Fourier transform infrared spectra of the amide I and II regions of the silks: 1) honeybee silk, 2) bumblebee silk, 3) bulldog ant silk, 4) weaver ant silk, 5) lacewing larval silk. All the silks have spectra expected of helical proteins. The Hymenopteran silks (ants and bees) have spectral maxima at 1645-1646 cm⁻¹ (labelled), shifted approximately 10 cm⁻¹ lower than a classical alpha-helical signal and broadened, as is typical of coiled-coil proteins (Heimburg et al., 1999).

FIG. 2. Comparison of amino acid composition of SDS washed honeybee brood comb silk with amino acid composition of Xenospira proteins (namely, Xenospira1, Xenospira2, Xenospira3 and Xenospira4) (equimolar amounts totaling 65%) and Xenosin (35%).

FIG. 3. Comparison of amino acid composition of silk with amino acid composition predicted from proteins encoded by silk genes.

FIG. 4. Prediction of coiled coil regions in honeybee silk proteins. COILS is a program that compares a sequence to a database of known parallel two-stranded coiled-coils and derives a similarity score. By comparing this score to the distribution of scores in globular and coiled-coil proteins, the program then calculates the probability that the sequence will adopt a coiled-coil conformation as described in Lupas et al. (1991). Using a window size of 28, this program predicts that the following numbers of residues exist in each protein in coiled coil domains: Xenospira3: 77; Xenospira4: 35; Xenospira1: 28; Xenospira2: 80.

FIG. 5. Alignment of honeybee silk proteins showing MARCOIL prediction of major heptads that form a coiled-coil structure. Heptad sequences are shown above the amino acids, and alanine residues in positions a and d are highlighted.

FIG. 6. Alignment of MARCOIL predicted coiled coil regions of hymenopteran (bees and ants) silk proteins showing the heptad position assignment. Amel, honeybee; BB, bumblebee; BA, bulldog ant; WA, weaver ant; F1-4, silk fibroins 1-4. Heptad sequences are shown above the amino acids, and alanine residues in positions a, d and e are highlighted.

FIG. 7. The amino acid character of heptad positions in the predicted coiled coil regions of the Mallada signata larval silk protein and the orthologous clusters of the Hymenopteran silk proteins.

FIG. 8. SDS polyacrylamide gel electrophoresis of late last instar salivary gland proteins. Proteins were identified after tryptic digest and analysis of the mass spectral data set using Agilent's Spectrum Mill software to match the data with predictions of protein sequences from proteins identified from cDNA sequences. The software generated scores for the quality of each match between experimentally observed sets of masses of fragments of peptides and the predictions of fragments that might be generated according to the sequences of proteins in a provided database. All the sequence matches shown here received scores greater than 20 by the Spectrum Mill software, where a score of 20 would be sufficient for automatic, confident acceptance of a valid match.

FIG. 9. Parsimony analysis of the coiled coil region of silk proteins. The relatedness of the four coiled-coil proteins suggests that the genes evolved from a common ancestor predating the divergence of the Euaculeata. The area bound by the dashed line indicates variation that occurred before the ants and wasps (Vespoidea) diverged from the bees (Apoidea) in the Late Jurassic (155 myrs; Grimaldi and Engel, 2005). Numbers indicating bootstrap values from 1000 iterations are shown.

FIG. 10. A) Apis mellifera silk proteins identified by mass spectral analysis of peptides generated from bee silk after digestion with trypsin. Shading indicates peptides identified by the mass spectral analysis. All the sequence matches shown here received scores greater than 20 by the Spectrum Mill software, where a score of 20 would be sufficient for automatic, confident acceptance of a valid match.

B) Full length amino acid sequences of bumblebee, bulldog ant, weaver ant and lacewing silk proteins.

FIG. 11. Open reading frames encoding honeybee, bumblebee, bulldog ant, weaver ant and lacewing silk proteins.

FIG. 12. Sequence of gene encoding Xenosin. The entire coding sequence is provided, which is interrupted by a single intron (highlighted).

FIG. 13. Expression of silk protein in tobacco. Detection of histidine tagged proteins after western blot analysis of proteins from: 1. E. coli transformed with empty expression vector, 2. E. coli transformed with expression vector containing AmelF4 (Xenospira4) coding region, 3. tobacco transformed with empty expression vector, 4. tobacco transformed with expression vector containing AmelF4 coding region.

FIG. 14. Fibres made from recombinant honeybee silk proteins showing birefringent threads. Birefringence indicates structure is present in the threads. Different recombinant honeybee threads are shown in each panel A-D, and recombinant lacewing thread is shown in panel E.

KEY TO THE SEQUENCE LISTING

SEQ ID NO:1 - Honeybee silk protein termed herein Xenospira1 (also termed herein AmelF1) (minus signal peptide).
SEQ ID NO:2 - Honeybee silk protein termed herein Xenospira1.
SEQ ID NO:3 - Honeybee silk protein termed herein Xenospira2 (also termed herein AmelF2) (minus signal peptide).
SEQ ID NO:4 - Honeybee silk protein termed herein Xenospira2.
SEQ ID NO:5 - Honeybee silk protein termed herein Xenospira3 (also termed herein AmelF3) (minus signal peptide).
SEQ ID NO:6 - Honeybee silk protein termed herein Xenospira3.
SEQ ID NO:7 - Honeybee silk protein termed herein Xenospira4 (also termed herein AmelF4) (minus signal peptide).
SEQ ID NO:8 - Honeybee silk protein termed herein Xenospira4.
SEQ ID NO:9 - Honeybee silk protein termed herein Xenosin (also termed herein AmelSA1) (minus signal peptide).
SEQ ID NO:10 - Honeybee silk protein termed herein Xenosin.
SEQ ID NO:11 - Nucleotide sequence encoding honeybee silk protein Xenospira1 (minus region encoding signal peptide).
SEQ ID NO:12 - Nucleotide sequence encoding honeybee silk protein Xenospira1.
SEQ ID NO:13 - Nucleotide sequence encoding honeybee silk protein Xenospira2 (minus region encoding signal peptide).
SEQ ID NO:14 - Nucleotide sequence encoding honeybee silk protein Xenospira2.
SEQ ID NO:15 - Nucleotide sequence encoding honeybee silk protein Xenospira3 (minus region encoding signal peptide).
SEQ ID NO:16 - Nucleotide sequence encoding honeybee silk protein Xenospira3.
SEQ ID NO:17 - Nucleotide sequence encoding honeybee silk protein Xenospira4 (minus region encoding signal peptide).
SEQ ID NO:18 - Nucleotide sequence encoding honeybee silk protein Xenospira4.
SEQ ID NO:19 - Nucleotide sequence encoding honeybee silk protein Xenosin (minus region encoding signal peptide).
SEQ ID NO:20 - Nucleotide sequence encoding honeybee silk protein Xenosin.
SEQ ID NO:21 - Gene sequence encoding honeybee silk protein Xenosin.
SEQ ID NO:22 - Bumblebee silk protein termed herein BBF1 (minus signal peptide).
SEQ ID NO:23 - Bumblebee silk protein termed herein BBF1.
SEQ ID NO:24 - Bumblebee silk protein termed herein BBF2 (minus signal peptide).
SEQ ID NO:25 - Bumblebee silk protein termed herein BBF2.
SEQ ID NO:26 - Bumblebee silk protein termed herein BBF3 (minus signal peptide).
SEQ ID NO:27 - Bumblebee silk protein termed herein BBF3.
SEQ ID NO:28 - Bumblebee silk protein termed herein BBF4 (minus signal peptide).
SEQ ID NO:29 - Bumblebee silk protein termed herein BBF4.
SEQ ID NO:30 - Partial amino acid sequence of bumblebee silk protein termed herein BBSA1.
SEQ ID NO:31 - Nucleotide sequence encoding bumblebee silk protein BBF1 (minus region encoding signal peptide).
SEQ ID NO:32 - Nucleotide sequence encoding bumblebee silk protein BBF1.
SEQ ID NO:33 - Nucleotide sequence encoding bumblebee silk protein BBF2 (minus region encoding signal peptide).
SEQ ID NO:34 - Nucleotide sequence encoding bumblebee silk protein BBF2.
SEQ ID NO:35 - Nucleotide sequence encoding bumblebee silk protein BBF3 (minus region encoding signal peptide).
SEQ ID NO:36 - Nucleotide sequence encoding bumblebee silk protein BBF3.
SEQ ID NO:37 - Nucleotide sequence encoding bumblebee silk protein BBF4 (minus region encoding signal peptide).
SEQ ID NO:38 - Nucleotide sequence encoding bumblebee silk protein BBF4.
SEQ ID NO:39 - Partial nucleotide sequence encoding bumblebee silk protein BBSA1.
SEQ ID NO:40 - Bulldog ant silk protein termed herein BAF1 (minus signal peptide).
SEQ ID NO:41 - Bulldog ant silk protein termed herein BAF1.
SEQ ID NO:42 - Bulldog ant silk protein termed herein BAF2 (minus signal peptide).
SEQ ID NO:43 - Bulldog ant silk protein termed herein BAF2.
SEQ ID NO:44 - Bulldog ant silk protein termed herein BAF3 (minus signal peptide).
SEQ ID NO:45 - Bulldog ant silk protein termed herein BAF3.
SEQ ID NO:46 - Bulldog ant silk protein termed herein BAF4 (minus signal peptide).
SEQ ID NO:47 - Bulldog ant silk protein termed herein BAF4.
SEQ ID NO:48 - Nucleotide sequence encoding bulldog ant silk protein BAF1 (minus region encoding signal peptide).
SEQ ID NO:49 - Nucleotide sequence encoding bulldog ant silk protein BAF1.
SEQ ID NO:50 - Nucleotide sequence encoding bulldog ant silk protein BAF2 (minus region encoding signal peptide).
SEQ ID NO:51 - Nucleotide sequence encoding bulldog ant silk protein BAF2.
SEQ ID NO:52 - Nucleotide sequence encoding bulldog ant silk protein BAF3 (minus region encoding signal peptide).
SEQ ID NO:53 - Nucleotide sequence encoding bulldog ant silk protein BAF3.
SEQ ID NO:54 - Nucleotide sequence encoding bulldog ant silk protein BAF4 (minus region encoding signal peptide).
SEQ ID NO:55 - Nucleotide sequence encoding bulldog ant silk protein BAF4.
SEQ ID NO:56 - Weaver ant silk protein termed herein GAF1 (minus signal peptide).
SEQ ID NO:57 - Weaver ant silk protein termed herein GAF1.
SEQ ID NO:58 - Weaver ant silk protein termed herein GAF2 (minus signal peptide).
SEQ ID NO:59 - Weaver ant silk protein termed herein GAF2.
SEQ ID NO:60 - Weaver ant silk protein termed herein GAF3 (minus signal peptide).
SEQ ID NO:61 - Weaver ant silk protein termed herein GAF3.
SEQ ID NO:62 - Weaver ant silk protein termed herein GAF4 (minus signal peptide).
SEQ ID NO:63 - Weaver ant silk protein termed herein GAF4.
SEQ ID NO:64 - Nucleotide sequence encoding weaver ant silk protein GAF1 (minus region encoding signal peptide).
SEQ ID NO:65 - Nucleotide sequence encoding weaver ant silk protein GAF1.
SEQ ID NO:66 - Nucleotide sequence encoding weaver ant silk protein GAF2 (minus region encoding signal peptide).
SEQ ID NO:67 - Nucleotide sequence encoding weaver ant silk protein GAF2.
SEQ ID NO:68 - Nucleotide sequence encoding weaver ant silk protein GAF3 (minus region encoding signal peptide).
SEQ ID NO:69 - Nucleotide sequence encoding weaver ant silk protein GAF3.
SEQ ID NO:70 - Nucleotide sequence encoding weaver ant silk protein GAF4 (minus region encoding signal peptide).
SEQ ID NO:71 - Nucleotide sequence encoding weaver ant silk protein GAF4.
SEQ ID NO:72 - Lacewing silk protein termed herein MalF1 (minus signal peptide).
SEQ ID NO:73 - Lacewing silk protein termed herein MalF1.
SEQ ID NO:74 - Nucleotide sequence encoding lacewing silk protein MalF1 (minus region encoding signal peptide).
SEQ ID NO:75 - Nucleotide sequence encoding lacewing silk protein MalF1.
SEQ ID NO:76 - Nucleotide sequence encoding honeybee silk protein termed herein Xenospira4 codon-optimized for plant expression (before subcloning into pET14b and pVEC8).
SEQ ID NO:77 - Honeybee silk protein (Xenospira4) open reading frame optimized for plant expression (without translational fusion).

DETAILED DESCRIPTION OF THE INVENTION

General Techniques and Definitions

Unless specifically defined otherwise, all technical and scientific terms used herein shall be taken to have the same meaning as commonly understood by one of ordinary skill in the art (e.g., in cell culture, molecular genetics, immunology, immunohistochemistry, protein chemistry, and biochemistry).

Unless otherwise indicated, the recombinant protein, cell culture, and immunological techniques utilized in the present invention are standard procedures, well known to those skilled in the art. Such techniques are described and explained throughout the literature in sources such as J. Perbal, A Practical Guide to Molecular Cloning, John Wiley and Sons (1984); J. Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press (1989); T. A. Brown (editor), Essential Molecular Biology: A Practical Approach, Volumes 1 and 2, IRL Press (1991); D. M. Glover and B. D. Hames (editors), DNA Cloning: A Practical Approach, Volumes 1-4, IRL Press (1995 and 1996); F. M. Ausubel et al. (editors), Current Protocols in Molecular Biology, Greene Pub. Associates and Wiley-Interscience (1988, including all updates until present); Ed Harlow and David Lane (editors), Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory (1988); and J. E. Coligan et al. (editors), Current Protocols in Immunology, John Wiley & Sons (including all updates until present), and are incorporated herein by reference.

As used herein, the terms “silk protein” and “silk polypeptide” refer to a fibrous protein/polypeptide that can be used to produce a silk fibre, and/or a fibrous protein complex. Naturally occurring silk proteins of the invention form part of the brood comb silk of insects such as honeybees; however, as described herein, variants of these proteins could readily be produced which would perform the same function if expressed within an appropriate insect.

As used herein, a “silk fibre” refers to filaments comprising proteins of the invention which can be woven into various items such as textiles.

As used herein, a “copolymer” is a composition comprising two or more silk proteins of the invention. This term excludes naturally occurring copolymers such as the brood comb of insects.

The term “plant” includes whole plants, vegetative structures (for example, leaves, stems), roots, floral organs/structures, seed (including embryo, endosperm, and seed coat), plant tissue (for example, vascular tissue, ground tissue, and the like), cells and progeny of the same.

A “transgenic plant” refers to a plant that contains a gene construct (“transgene”) not found in a wild-type plant of the same species, variety or cultivar. A “transgene” as referred to herein has the normal meaning in the art of biotechnology and includes a genetic sequence which has been produced or altered by recombinant DNA or RNA technology and which has been introduced into the plant cell. The transgene may include genetic sequences derived from a plant cell. Typically, the transgene has been introduced into the plant by human manipulation such as, for example, by transformation, but any method can be used, as one of skill in the art recognizes.

“Polynucleotide” refers to an oligonucleotide, nucleic acid molecule or any fragment thereof. It may be DNA or RNA of genomic or synthetic origin, double-stranded or single-stranded, and combined with carbohydrate, lipids, protein, or other materials to perform a particular activity defined herein.

“Operably linked” as used herein refers to a functional relationship between two or more nucleic acid (e.g., DNA) segments. Typically, it refers to the functional relationship of a transcriptional regulatory element to a transcribed sequence. For example, a promoter is operably linked to a coding sequence, such as a polynucleotide defined herein, if it stimulates or modulates the transcription of the coding sequence in an appropriate host cell. Generally, promoter transcriptional regulatory elements that are operably linked to a transcribed sequence are physically contiguous to the transcribed sequence, i.e., they are cis-acting. However, some transcriptional regulatory elements, such as enhancers, need not be physically contiguous or located in close proximity to the coding sequences whose transcription they enhance.

The term “signal peptide” refers to an amino terminal polypeptide preceding a secreted mature protein. The signal peptide is cleaved from and is therefore not present in the mature protein. Signal peptides have the function of directing and translocating secreted proteins across cell membranes. The signal peptide is also referred to as a signal sequence.

As used herein, “transformation” is the acquisition of new genes in a cell by the incorporation of a polynucleotide.

As used herein, the term “drug” refers to any compound that can be used to treat or prevent a particular disease. Examples of drugs which can be formulated with a silk protein of the invention include, but are not limited to, proteins, nucleic acids, anti-tumor agents, analgesics, antibiotics, anti-inflammatory compounds (both steroidal and non-steroidal), hormones, vaccines, labeled substances, and the like.

Polypeptides

By “substantially purified polypeptide” we mean a polypeptide that has generally been separated from the lipids, nucleic acids, other polypeptides, and other contaminating molecules such as wax with which it is associated in its native state. With the exception of other proteins of the invention, it is preferred that the substantially purified polypeptide is at least 60% free, more preferably at least 75% free, and more preferably at least 90% free from other components with which it is naturally associated.

The term “recombinant” in the context of a polypeptide refers to the polypeptide when produced by a cell, or in a cell-free expression system, in an altered amount or at an altered rate compared to its native state. In one embodiment the cell is a cell that does not naturally produce the polypeptide. However, the cell may be a cell which comprises a non-endogenous gene that causes an altered, preferably increased, amount of the polypeptide to be produced. A recombinant polypeptide of the invention includes polypeptides which have not been separated from other components of the transgenic (recombinant) cell, or cell-free expression system, in which it is produced, and polypeptides produced in such cells or cell-free systems which are subsequently purified away from at least some other components.

The terms “polypeptide” and “protein” are generally used interchangeably and refer to a single polypeptide chain which may or may not be modified by addition of non-amino acid groups. The terms “proteins” and “polypeptides” as used herein also include variants, mutants, modifications, analogues and/or derivatives of the polypeptides of the invention as described herein.

The % identity of a polypeptide is determined by GAP (Needleman and Wunsch, 1970) analysis (GCG program) with a gap creation penalty=5, and a gap extension penalty=0.3. The query sequence is at least 15 amino acids in length, and the GAP analysis aligns the two sequences over a region of at least 15 amino acids. More preferably, the query sequence is at least 50 amino acids in length, and the GAP analysis aligns the two sequences over a region of at least 50 amino acids. More preferably, the query sequence is at least 100 amino acids in length and the GAP analysis aligns the two sequences over a region of at least 100 amino acids. Even more preferably, the query sequence is at least 250 amino acids in length and the GAP analysis aligns the two sequences over a region of at least 250 amino acids. Even more preferably, the GAP analysis aligns the two sequences over their entire length.
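By way of illustration only, the following is a minimal Python sketch of a global (Needleman and Wunsch type) alignment with separate gap creation and gap extension penalties, followed by a % identity calculation. It is not the GCG GAP program. The function names, the simple match/mismatch scores (GAP would normally use an amino acid scoring matrix), the charging of the extension penalty on every gapped position, and the choice to compute % identity over aligned, non-gapped columns are all assumptions made for this sketch.

```python
from typing import Tuple

NEG_INF = float("-inf")


def gap_align(a: str, b: str, match: float = 1.0, mismatch: float = -1.0,
              gap_open: float = 5.0, gap_extend: float = 0.3) -> Tuple[str, str]:
    """Globally align a and b with affine gap penalties (illustrative only).

    States: 0 = residue aligned to residue, 1 = gap in b, 2 = gap in a.
    A gap of length k is assumed to cost gap_open + k * gap_extend.
    """
    n, m = len(a), len(b)
    score = [[[NEG_INF] * (m + 1) for _ in range(n + 1)] for _ in range(3)]
    back = [[[0] * (m + 1) for _ in range(n + 1)] for _ in range(3)]
    score[0][0][0] = 0.0
    for i in range(1, n + 1):          # leading gap in b
        score[1][i][0] = -(gap_open + gap_extend * i)
        back[1][i][0] = 1
    for j in range(1, m + 1):          # leading gap in a
        score[2][0][j] = -(gap_open + gap_extend * j)
        back[2][0][j] = 2
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # State 0: align a[i-1] with b[j-1], coming from any state.
            cands = [score[k][i - 1][j - 1] for k in range(3)]
            k = max(range(3), key=cands.__getitem__)
            score[0][i][j], back[0][i][j] = cands[k] + sub, k
            # State 1: a[i-1] against a gap; extending is cheaper than opening.
            cands = [score[k][i - 1][j]
                     - (gap_extend if k == 1 else gap_open + gap_extend)
                     for k in range(3)]
            k = max(range(3), key=cands.__getitem__)
            score[1][i][j], back[1][i][j] = cands[k], k
            # State 2: b[j-1] against a gap.
            cands = [score[k][i][j - 1]
                     - (gap_extend if k == 2 else gap_open + gap_extend)
                     for k in range(3)]
            k = max(range(3), key=cands.__getitem__)
            score[2][i][j], back[2][i][j] = cands[k], k
    # Trace back from the best-scoring state in the final cell.
    state = max(range(3), key=lambda s: score[s][n][m])
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        prev = back[state][i][j]
        if state == 0:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif state == 1:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
        state = prev
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical columns as a percentage of non-gapped columns (an assumption)."""
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)
```

For example, `percent_identity(*gap_align(query, subject))` returns a single figure that could be compared against a chosen minimum % identity, provided the aligned region also satisfies the minimum length stated above.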

As used herein, a “biologically active” fragment is a portion of a polypeptide of the invention which maintains a defined activity of the full-length polypeptide, namely the ability to be used to produce silk. Biologically active fragments can be any size as long as they maintain the defined activity.

With regard to a defined polypeptide, it will be appreciated that % identity figures higher than those provided above will encompass preferred embodiments. Thus, where applicable, in light of the minimum % identity figures, it is preferred that the polypeptide comprises an amino acid sequence which is at least 40%, more preferably at least 45%, more preferably at least 50%, more preferably at least 55%, more preferably at least 60%, more preferably at least 65%, more preferably at least 70%, more preferably at least 75%, more preferably at least 80%, more preferably at least 85%, more preferably at least 90%, more preferably at least 91%, more preferably at least 92%, more preferably at least 93%, more preferably at least 94%, more preferably at least 95%, more preferably at least 96%, more preferably at least 97%, more preferably at least 98%, more preferably at least 99%, more preferably at least 99.1%, more preferably at least 99.2%, more preferably at least 99.3%, more preferably at least 99.4%, more preferably at least 99.5%, more preferably at least 99.6%, more preferably at least 99.7%, more preferably at least 99.8%, and even more preferably at least 99.9% identical to the relevant nominated SEQ ID NO.

Amino acid sequence mutants of the polypeptides of the present invention can be prepared by introducing appropriate nucleotide changes into a nucleic acid of the present invention, or by in vitro synthesis of the desired polypeptide. Such mutants include, for example, deletions, insertions or substitutions of residues within the amino acid sequence. A combination of deletion, insertion and substitution can be made to arrive at the final construct, provided that the final polypeptide product possesses the desired characteristics.

Mutant (altered) polypeptides can be prepared using any technique known in the art. For example, a polynucleotide of the invention can be subjected to in vitro mutagenesis. Such in vitro mutagenesis techniques include sub-cloning the polynucleotide into a suitable vector, transforming the vector into a “mutator” strain such as the E. coli XL-1 red (Stratagene) and propagating the transformed bacteria for a suitable number of generations. In another example, the polynucleotides of the invention are subjected to DNA shuffling techniques as broadly described by Harayama (1998). These DNA shuffling techniques may include genes of the invention possibly in addition to genes related to those of the present invention, such as silk genes from Hymenopteran or Neuropteran species other than the specific species characterized herein. Products derived from mutated/altered DNA can readily be screened using techniques described herein to determine if they can be used as silk proteins.

In designing amino acid sequence mutants, the location of the mutation site and the nature of the mutation will depend on the characteristic(s) to be modified. The sites for mutation can be modified individually or in series, e.g., by (1) substituting first with conservative amino acid choices and then with more radical selections depending upon the results achieved, (2) deleting the target residue, or (3) inserting other residues adjacent to the located site.

Amino acid sequence deletions generally range from about 1 to 15 residues, more preferably about 1 to 10 residues and typically about 1 to 5 contiguous residues.

Substitution mutants have at least one amino acid residue in the polypeptide molecule removed and a different residue inserted in its place. The sites of greatest interest for substitutional mutagenesis include sites identified as important for function. Other sites of interest are those in which particular residues obtained from various strains or species are identical. These positions may be important for biological activity. These sites, especially those falling within a sequence of at least three other identically conserved sites, are preferably substituted in a relatively conservative manner. Such conservative substitutions are shown in Table 1 under the heading of “exemplary substitutions”.

As outlined above, a portion of some of the polypeptides of the invention have a coiled coil structure. Coiled coil structures of polypeptides are characterized by heptad repeats represented by the consensus sequence (abcdefg)_n. In a preferred embodiment, the portion of the polypeptide that has a coiled coil structure comprises at least 10 copies of the heptad sequence abcdefg, and at least 25% of the amino acids at positions a and d are alanine residues.

TABLE 1 Exemplary substitutions

Original Residue    Exemplary Substitutions
Ala (A)             val; leu; ile; gly; cys; ser; thr
Arg (R)             lys
Asn (N)             gln; his
Asp (D)             glu
Cys (C)             ser; thr; ala; gly; val
Gln (Q)             asn; his
Glu (E)             asp
Gly (G)             pro; ala; ser; val; thr
His (H)             asn; gln
Ile (I)             leu; val; ala; met
Leu (L)             ile; val; met; ala; phe
Lys (K)             arg
Met (M)             leu; phe
Phe (F)             leu; val; ala
Pro (P)             gly
Ser (S)             thr; ala; gly; val; gln
Thr (T)             ser; gln; ala
Trp (W)             tyr
Tyr (Y)             trp; phe
Val (V)             ile; leu; met; phe; ala; ser; thr
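As an illustrative aid only, Table 1 can be encoded as a lookup so that each difference between an original sequence and a candidate variant is flagged as falling within, or outside, the exemplary substitutions. The sketch below uses one-letter codes; the names `EXEMPLARY_SUBSTITUTIONS`, `is_exemplary_substitution` and `classify_substitutions` are hypothetical, and the sketch assumes the two sequences are of equal length with no insertions or deletions.

```python
from typing import Dict, List, Tuple

# Table 1, transcribed into one-letter amino acid codes.
EXEMPLARY_SUBSTITUTIONS: Dict[str, str] = {
    "A": "VLIGCST", "R": "K", "N": "QH", "D": "E", "C": "STAGV",
    "Q": "NH", "E": "D", "G": "PASVT", "H": "NQ", "I": "LVAM",
    "L": "IVMAF", "K": "R", "M": "LF", "F": "LVA", "P": "G",
    "S": "TAGVQ", "T": "SQA", "W": "Y", "Y": "WF", "V": "ILMFAST",
}


def is_exemplary_substitution(original: str, replacement: str) -> bool:
    """True if replacing `original` with `replacement` is listed in Table 1."""
    return replacement.upper() in EXEMPLARY_SUBSTITUTIONS.get(original.upper(), "")


def classify_substitutions(original: str, variant: str) -> List[Tuple[int, str, str, bool]]:
    """List (1-based position, original, replacement, listed_in_table_1)
    for every position at which two equal-length sequences differ."""
    if len(original) != len(variant):
        raise ValueError("sketch assumes equal-length sequences without indels")
    return [
        (pos, o, v, is_exemplary_substitution(o, v))
        for pos, (o, v) in enumerate(zip(original.upper(), variant.upper()), start=1)
        if o != v
    ]
```

A substitution reported as not listed is not thereby excluded; it merely falls outside the exemplary substitutions of Table 1.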

In a preferred embodiment, the polypeptide that has a coiled coil structure comprises at least 12 consecutive copies, more preferably at least 15 consecutive copies, and even more preferably at least 18 consecutive copies of the heptad. In further embodiments, the polypeptide that has a coiled coil structure can have up to at least 28 copies of the heptad. Typically, the copies of the heptad will be tandemly repeated. However, they do not necessarily have to be perfect tandem repeats; for example, as shown in FIGS. 5 and 6, a few amino acids may be found between two heptads, or a few truncated heptads may be found (see, for example, Xenospira1 in FIG. 5).

Guidance regarding amino acid substitutions which can be made to the polypeptides of the invention which have a coiled coil structure is provided in FIGS. 5 and 6, as well as Tables 6 to 10. Where a predicted useful amino acid substitution based on the experimental data provided herein is in any way in conflict with the exemplary substitutions provided in Table 1, it is preferred that a substitution based on the experimental data is used.

Coiled coil structures of polypeptides of the invention have a high content of alanine residues, particularly at amino acid positions a, d and e of the heptad. However, positions b, c, f and g also have a high frequency of alanine residues. In a preferred embodiment, at least 15% of the amino acids at positions a, d and/or e of the heptads are alanine residues, more preferably at least 25%, more preferably at least 30%, more preferably at least 40%, and even more preferably at least 50%. In a further preferred embodiment, at least 25% of the amino acids at both positions a and d of the heptads are alanine residues, more preferably at least 30%, more preferably at least 40%, and even more preferably at least 50%. Furthermore, it is preferred that at least 15% of the amino acids at positions b, c, f and g of the heptads are alanine residues, more preferably at least 20%, and even more preferably at least 25%.
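The alanine preferences described above can be checked computationally once a heptad register has been assigned. The following Python sketch is illustrative only: it assumes perfect tandem heptads with a known starting position (real sequences may contain intervening residues or truncated heptads), and the function names and default thresholds (at least 10 copies, at least 25% alanine at both positions a and d) are choices made for this sketch rather than a required method.

```python
from collections import Counter
from typing import Dict

HEPTAD = "abcdefg"


def alanine_fraction_by_position(seq: str, first_position: str = "a") -> Dict[str, float]:
    """Fraction of alanine at each heptad position, assuming perfect tandem
    heptads in which the first residue occupies `first_position`."""
    offset = HEPTAD.index(first_position)
    totals: Counter = Counter()
    alanines: Counter = Counter()
    for i, residue in enumerate(seq.upper()):
        pos = HEPTAD[(offset + i) % 7]
        totals[pos] += 1
        if residue == "A":
            alanines[pos] += 1
    return {p: (alanines[p] / totals[p] if totals[p] else 0.0) for p in HEPTAD}


def meets_core_alanine_preference(seq: str, first_position: str = "a",
                                  min_copies: int = 10,
                                  min_fraction: float = 0.25) -> bool:
    """True if the region holds at least `min_copies` complete heptads and
    alanine reaches `min_fraction` at both positions a and d."""
    fractions = alanine_fraction_by_position(seq, first_position)
    return (len(seq) // 7 >= min_copies
            and fractions["a"] >= min_fraction
            and fractions["d"] >= min_fraction)
```

The same per-position fractions can be compared against the other thresholds recited above (for example, 15% at positions b, c, f and g) by changing which positions and cut-offs are tested.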

Typically, the heptads will not comprise any proline or histidine residues. Furthermore, the heptads will comprise few (1 or 2), if any, phenylalanine, methionine, tyrosine, cysteine, glycine or tryptophan residues. Apart from alanine, common (for example greater than 5%, more preferably greater than 10%) amino acids in the heptads include leucine (particularly at positions b and d), serine (particularly at positions b, e and f), glutamic acid (particularly at positions c, e and d), lysine (particularly at positions b, c, d, f and g) as well as arginine at position g.

Polypeptides (and polynucleotides) of the invention can be purified (isolated) from a wide variety of Hymenopteran and Neuropteran species. Examples of Hymenopterans include, but are not limited to, any species of the Suborder Apocrita (bees, ants and wasps), which include the following Families of insects: Chrysididae (cuckoo wasps), Formicidae (ants), Mutillidae (velvet ants), Pompilidae (spider wasps), Scoliidae, Vespidae (paper wasps, potter wasps, hornets), Agaonidae (fig wasps), Chalcididae (chalcidids), Eucharitidae (eucharitids), Eupelmidae (eupelmids), Pteromalidae (pteromalids), Evaniidae (ensign wasps), Braconidae, Ichneumonidae (ichneumons), Megachilidae, Apidae, Colletidae, Halictidae, and Melittidae (oil collecting bees). Examples of Neuropterans include species from the following insect Families: Mantispidae, Chrysopidae (lacewings), Myrmeleontidae (antlions), and Ascalaphidae (owlflies). Such further polypeptides (and polynucleotides) can be characterized using the same procedures described herein for silks from Bombus terrestris, Myrmecia forficata, Oecophylla smaragdina and Mallada signata.

Furthermore, if desired, unnatural amino acids or chemical amino acid analogues can be introduced as a substitution or addition into the polypeptides of the present invention. Such amino acids include, but are not limited to, the D-isomers of the common amino acids, 2,4-diaminobutyric acid, α-amino isobutyric acid, 4-aminobutyric acid, 2-aminobutyric acid, 6-amino hexanoic acid, 2-amino isobutyric acid, 3-amino propionic acid, ornithine, norleucine, norvaline, hydroxyproline, sarcosine, citrulline, homocitrulline, cysteic acid, t-butylglycine, t-butylalanine, phenylglycine, cyclohexylalanine, β-alanine, fluoro-amino acids, designer amino acids such as β-methyl amino acids, Cα-methyl amino acids, Nα-methyl amino acids, and amino acid analogues in general.

Also included within the scope of the invention are polypeptides of the present invention which are differentially modified during or after synthesis, e.g., by biotinylation, benzylation, glycosylation, acetylation, phosphorylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, linkage to an antibody molecule or other cellular ligand, etc. These modifications may serve to increase the stability and/or bioactivity of the polypeptide of the invention.

Polypeptides of the present invention can be produced in a variety of ways, including production and recovery of natural polypeptides, production and recovery of recombinant polypeptides, and chemical synthesis of the polypeptides. In one embodiment, an isolated polypeptide of the present invention is produced by culturing a cell capable of expressing the polypeptide under conditions effective to produce the polypeptide, and recovering the polypeptide. A preferred cell to culture is a recombinant cell of the present invention. Effective culture conditions include, but are not limited to, effective media, bioreactor, temperature, pH and oxygen conditions that permit polypeptide production. An effective medium refers to any medium in which a cell is cultured to produce a polypeptide of the present invention. Such medium typically comprises an aqueous medium having assimilable carbon, nitrogen and phosphate sources, and appropriate salts, minerals, metals and other nutrients, such as vitamins. Cells of the present invention can be cultured in conventional fermentation bioreactors, shake flasks, test tubes, microtiter dishes, and petri plates. Culturing can be carried out at a temperature, pH and oxygen content appropriate for a recombinant cell. Such culturing conditions are within the expertise of one of ordinary skill in the art.

Polynucleotides

By an “isolated polynucleotide”, including DNA, RNA, or a combination of these, single or double stranded, in the sense or antisense orientation or a combination of both, dsRNA or otherwise, we mean a polynucleotide which is at least partially separated from the polynucleotide sequences with which it is associated or linked in its native state. Preferably, the isolated polynucleotide is at least 60% free, preferably at least 75% free, and most preferably at least 90% free from other components with which it is naturally associated. Furthermore, the term “polynucleotide” is used interchangeably herein with the term “nucleic acid”.

The term “exogenous” in the context of a polynucleotide refers to the polynucleotide when present in a cell, or in a cell-free expression system, in an altered amount compared to its native state. In one embodiment, the cell is a cell that does not naturally comprise the polynucleotide. However, the cell may be a cell which comprises a non-endogenous polynucleotide resulting in an altered, preferably increased, amount of production of the encoded polypeptide. An exogenous polynucleotide of the invention includes polynucleotides which have not been separated from other components of the transgenic (recombinant) cell, or cell-free expression system, in which it is present, and polynucleotides produced in such cells or cell-free systems which are subsequently purified away from at least some other components.

The % identity of a polynucleotide is determined by GAP (Needleman and Wunsch, 1970) analysis (GCG program) with a gap creation penalty=5, and a gap extension penalty=0.3. Unless stated otherwise, the query sequence is at least 45 nucleotides in length, and the GAP analysis aligns the two sequences over a region of at least 45 nucleotides. Preferably, the query sequence is at least 150 nucleotides in length, and the GAP analysis aligns the two sequences over a region of at least 150 nucleotides. More preferably, the query sequence is at least 300 nucleotides in length and the GAP analysis aligns the two sequences over a region of at least 300 nucleotides. Even more preferably, the GAP analysis aligns the two sequences over their entire length.

With regard to the defined polynucleotides, it will be appreciated that % identity figures higher than those provided above will encompass preferred embodiments. Thus, where applicable, in light of the minimum % identity figures, it is preferred that a polynucleotide of the invention comprises a sequence which is at least 40%, more preferably at least 45%, more preferably at least 50%, more preferably at least 55%, more preferably at least 60%, more preferably at least 65%, more preferably at least 70%, more preferably at least 75%, more preferably at least 80%, more preferably at least 85%, more preferably at least 90%, more preferably at least 91%, more preferably at least 92%, more preferably at least 93%, more preferably at least 94%, more preferably at least 95%, more preferably at least 96%, more preferably at least 97%, more preferably at least 98%, more preferably at least 99%, more preferably at least 99.1%, more preferably at least 99.2%, more preferably at least 99.3%, more preferably at least 99.4%, more preferably at least 99.5%, more preferably at least 99.6%, more preferably at least 99.7%, more preferably at least 99.8%, and even more preferably at least 99.9% identical to the relevant nominated SEQ ID NO.

Polynucleotides of the present invention may possess, when compared to naturally occurring molecules, one or more mutations which are deletions, insertions, or substitutions of nucleotide residues. Mutants can be either naturally occurring (that is to say, isolated from a natural source) or synthetic (for example, by performing site-directed mutagenesis on the nucleic acid).

Oligonucleotides and/or polynucleotides of the invention hybridize to a silk gene of the present invention, or a region flanking said gene, under stringent conditions. The term “stringent hybridization conditions” and the like as used herein refers to parameters with which the art is familiar, including the variation of the hybridization temperature with length of an oligonucleotide. Nucleic acid hybridization parameters may be found in references which compile such methods, Sambrook, et al. (supra), and Ausubel, et al. (supra). For example, stringent hybridization conditions, as used herein, can refer to hybridization at 65° C. in hybridization buffer (3.5×SSC, 0.02% Ficoll, 0.02% polyvinyl pyrrolidone, 0.02% Bovine Serum Albumin (BSA), 2.5 mM NaH₂PO₄ (pH 7), 0.5% SDS, 2 mM EDTA), followed by one or more washes in 0.2×SSC, 0.01% BSA at 50° C. Alternatively, the nucleic acid and/or oligonucleotides (which may also be referred to as “primers” or “probes”) hybridize to a region of an insect genome of interest, such as the genome of a honeybee, under conditions used in nucleic acid amplification techniques such as PCR.
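As a worked illustration of the exemplary hybridization buffer recited above, the following Python sketch computes how much of each stock solution would be needed for a given final volume, using the dilution relation (final concentration × final volume = stock concentration × stock volume). The final concentrations are those recited above; the stock concentrations (20×SSC, 2% Ficoll, 2% polyvinyl pyrrolidone, 2% BSA, 1 M NaH₂PO₄, 10% SDS, 0.5 M EDTA) and all names in the code are assumptions made for the example and are not specified herein.

```python
from typing import Dict

# Final concentrations as recited for the hybridization buffer.
# Units: SSC in "x"; Ficoll, PVP, BSA and SDS in %; NaH2PO4 and EDTA in mM.
FINAL_CONCENTRATIONS: Dict[str, float] = {
    "SSC": 3.5, "Ficoll": 0.02, "PVP": 0.02, "BSA": 0.02,
    "NaH2PO4": 2.5, "SDS": 0.5, "EDTA": 2.0,
}

# Hypothetical stock concentrations in the same units (assumed, not from the text).
ASSUMED_STOCKS: Dict[str, float] = {
    "SSC": 20.0, "Ficoll": 2.0, "PVP": 2.0, "BSA": 2.0,
    "NaH2PO4": 1000.0, "SDS": 10.0, "EDTA": 500.0,
}


def stock_volume_ml(final_conc: float, stock_conc: float, total_ml: float) -> float:
    """Volume of stock needed so that it is diluted to final_conc in total_ml."""
    if final_conc > stock_conc:
        raise ValueError("stock is more dilute than the requested final concentration")
    return final_conc * total_ml / stock_conc


def buffer_recipe(total_ml: float) -> Dict[str, float]:
    """Volumes (mL) of each assumed stock, plus water to make up total_ml."""
    recipe = {
        name: stock_volume_ml(conc, ASSUMED_STOCKS[name], total_ml)
        for name, conc in FINAL_CONCENTRATIONS.items()
    }
    recipe["water"] = total_ml - sum(recipe.values())
    return recipe
```

With these assumed stocks, `buffer_recipe(100.0)` allocates 17.5 mL of 20×SSC and 5 mL of 10% SDS, with the balance made up of the remaining stocks and water.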

Oligonucleotides of the present invention can be RNA, DNA, or derivatives of either. Although the terms polynucleotide and oligonucleotide have overlapping meaning, oligonucleotides are typically relatively short single stranded molecules. The minimum size of such oligonucleotides is the size required for the formation of a stable hybrid between an oligonucleotide and a complementary sequence on a target nucleic acid molecule. Preferably, the oligonucleotides are at least 15 nucleotides, more preferably at least 18 nucleotides, more preferably at least 19 nucleotides, more preferably at least 20 nucleotides, even more preferably at least 25 nucleotides in length.

Usually, monomers of a polynucleotide or oligonucleotide are linked by phosphodiester bonds or analogs thereof to form oligonucleotides ranging in size from relatively few monomeric units, e.g., 12-18, to several hundreds of monomeric units. Analogs of phosphodiester linkages include: phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoranilidate, and phosphoramidate.

The present invention includes oligonucleotides that can be used as, for example, probes to identify nucleic acid molecules, or primers to produce nucleic acid molecules. Oligonucleotides of the present invention used as a probe are typically conjugated with a detectable label such as a radioisotope, an enzyme, biotin, a fluorescent molecule or a chemiluminescent molecule.

Recombinant Vectors

One embodiment of the present invention includes a recombinant vector, which comprises at least one isolated polynucleotide molecule of the present invention, inserted into any vector capable of delivering the polynucleotide molecule into a host cell. Such a vector contains heterologous polynucleotide sequences, that is, polynucleotide sequences that are not naturally found adjacent to polynucleotide molecules of the present invention and that preferably are derived from a species other than the species from which the polynucleotide molecule(s) are derived. The vector can be either RNA or DNA, either prokaryotic or eukaryotic, and typically is a transposon (such as described in U.S. Pat. No. 5,792,294), a virus or a plasmid.

One type of recombinant vector comprises a polynucleotide molecule of the present invention operatively linked to an expression vector. The phrase operatively linked refers to insertion of a polynucleotide molecule into an expression vector in a manner such that the molecule is able to be expressed when transformed into a host cell. As used herein, an expression vector is a DNA or RNA vector that is capable of transforming a host cell and of effecting expression of a specified polynucleotide molecule. Preferably, the expression vector is also capable of replicating within the host cell. Expression vectors can be either prokaryotic or eukaryotic, and are typically viruses or plasmids. Expression vectors of the present invention include any vectors that function (i.e., direct gene expression) in recombinant cells of the present invention, including in bacterial, fungal, endoparasite, arthropod, animal, and plant cells. Particularly preferred expression vectors of the present invention can direct gene expression in plant cells. Vectors of the invention can also be used to produce the polypeptide in a cell-free expression system; such systems are well known in the art.

In particular, expression vectors of the present invention contain regulatory sequences such as transcription control sequences, translation control sequences, origins of replication, and other regulatory sequences that are compatible with the recombinant cell and that control the expression of polynucleotide molecules of the present invention. In particular, recombinant molecules of the present invention include transcription control sequences. Transcription control sequences are sequences which control the initiation, elongation, and termination of transcription. Particularly important transcription control sequences are those which control transcription initiation, such as promoter, enhancer, operator and repressor sequences. Suitable transcription control sequences include any transcription control sequence that can function in at least one of the recombinant cells of the present invention. A variety of such transcription control sequences are known to those skilled in the art. Preferred transcription control sequences include those which function in bacterial, yeast, arthropod, plant or mammalian cells, such as, but not limited to, tac, lac, trp, trc, oxy-pro, omp/lpp, rrnB, bacteriophage lambda, bacteriophage T7, T7lac, bacteriophage T3, bacteriophage SP6, bacteriophage SP01, metallothionein, alpha-mating factor, Pichia alcohol oxidase, alphavirus subgenomic promoters (such as Sindbis virus subgenomic promoters), antibiotic resistance gene, baculovirus, Heliothis zea insect virus, vaccinia virus, herpesvirus, raccoon poxvirus, other poxvirus, adenovirus, cytomegalovirus (such as immediate early promoters), simian virus 40, retrovirus, actin, retroviral long terminal repeat, Rous sarcoma virus, heat shock, phosphate and nitrate transcription control sequences as well as other sequences capable of controlling gene expression in prokaryotic or eukaryotic cells.

Particularly preferred transcription control sequences are promoters active in directing transcription in plants, either constitutively or in a stage- and/or tissue-specific manner, depending on the use of the plant or parts thereof. These plant promoters include, but are not limited to, promoters showing constitutive expression, such as the 35S promoter of Cauliflower Mosaic Virus (CaMV), those for leaf-specific expression, such as the promoter of the ribulose bisphosphate carboxylase small subunit gene, those for root-specific expression, such as the promoter from the glutamine synthase gene, those for seed-specific expression, such as the cruciferin A promoter from Brassica napus, those for tuber-specific expression, such as the class-I patatin promoter from potato, or those for fruit-specific expression, such as the polygalacturonase (PG) promoter from tomato.

Recombinant molecules of the present invention may also (a) contain secretory signals (i.e., signal segment nucleic acid sequences) to enable an expressed polypeptide of the present invention to be secreted from the cell that produces the polypeptide and/or (b) contain fusion sequences which lead to the expression of nucleic acid molecules of the present invention as fusion proteins. Examples of suitable signal segments include any signal segment capable of directing the secretion of a polypeptide of the present invention. Preferred signal segments include, but are not limited to, tissue plasminogen activator (t-PA), interferon, interleukin, growth hormone, viral envelope glycoprotein signal segments, Nicotiana nectarin signal peptide (U.S. Pat. No. 5,939,288), tobacco extensin signal, the soy oleosin oil body binding protein signal, Arabidopsis thaliana vacuolar basic chitinase signal peptide, as well as native signal sequences of a polypeptide of the invention. In addition, a nucleic acid molecule of the present invention can be joined to a fusion segment that directs the encoded polypeptide to the proteasome, such as a ubiquitin fusion segment. Recombinant molecules may also include intervening and/or untranslated sequences surrounding and/or within the nucleic acid sequences of the present invention.

Host Cells

Another embodiment of the present invention includes a recombinant cell comprising a host cell transformed with one or more recombinant molecules of the present invention, or progeny cells thereof. Transformation of a polynucleotide molecule into a cell can be accomplished by any method by which a polynucleotide molecule can be inserted into the cell. Transformation techniques include, but are not limited to, transfection, electroporation, microinjection, lipofection, adsorption, and protoplast fusion. A recombinant cell may remain unicellular or may grow into a tissue, organ or a multicellular organism. Transformed polynucleotide molecules of the present invention can remain extrachromosomal or can integrate into one or more sites within a chromosome of the transformed (i.e., recombinant) cell in such a manner that their ability to be expressed is retained.

Suitable host cells to transform include any cell that can be transformed with a polynucleotide of the present invention. Host cells of the present invention either can be endogenously (i.e., naturally) capable of producing polypeptides of the present invention or can be capable of producing such polypeptides after being transformed with at least one polynucleotide molecule of the present invention. Host cells of the present invention can be any cell capable of producing at least one protein of the present invention, and include bacterial, fungal (including yeast), parasite, arthropod, animal and plant cells. Examples of host cells include Salmonella, Escherichia, Bacillus, Listeria, Saccharomyces, Spodoptera, Mycobacteria, Trichoplusia, BHK (baby hamster kidney) cells, MDCK cells, CRFK cells, CV-1 cells, COS (e.g., COS-7) cells, and Vero cells. Further examples of host cells are E. coli, including E. coli K-12 derivatives; Salmonella typhi; Salmonella typhimurium, including attenuated strains; Spodoptera frugiperda; Trichoplusia ni; and non-tumorigenic mouse myoblast G8 cells (e.g., ATCC CRL 1246). Additional appropriate mammalian cell hosts include other kidney cell lines, other fibroblast cell lines (e.g., human, murine or chicken embryo fibroblast cell lines), myeloma cell lines, Chinese hamster ovary cells, mouse NIH/3T3 cells, LMTK cells and/or HeLa cells. Particularly preferred host cells are plant cells such as those available from Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (German Collection of Microorganisms and Cell Cultures).

Recombinant DNA technologies can be used to improve expression of a transformed polynucleotide molecule by manipulating, for example, the number of copies of the polynucleotide molecule within a host cell, the efficiency with which those polynucleotide molecules are transcribed, the efficiency with which the resultant transcripts are translated, and the efficiency of post-translational modifications. Recombinant techniques useful for increasing the expression of polynucleotide molecules of the present invention include, but are not limited to, operatively linking polynucleotide molecules to high-copy number plasmids, integration of the polynucleotide molecule into one or more host cell chromosomes, addition of vector stability sequences to plasmids, substitutions or modifications of transcription control signals (e.g., promoters, operators, enhancers), substitutions or modifications of translational control signals (e.g., ribosome binding sites, Shine-Dalgarno sequences), modification of polynucleotide molecules of the present invention to correspond to the codon usage of the host cell, and the deletion of sequences that destabilize transcripts.
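By way of illustration only, modifying a polynucleotide to correspond to the codon usage of a host cell can be sketched as a synonymous recoding step. The sketch below is not a method taken from this specification: the preference table `PREFERRED_CODON`, the function names and the input fragment are all hypothetical, and a real design would draw on measured codon usage data for the chosen host.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical host preferences (illustration only, not measured usage data).
PREFERRED_CODON = {"A": "GCG", "L": "CTG", "K": "AAA"}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode_for_host(dna, preferred=PREFERRED_CODON):
    """Swap each codon for the listed host-preferred synonym, if any."""
    # Guard against a preference table that would change the protein.
    for aa, codon in preferred.items():
        if CODON_TO_AA[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    recoded = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        recoded.append(preferred.get(CODON_TO_AA[codon], codon))
    return "".join(recoded)


original = "ATGGCTTTAAAGTAA"  # hypothetical fragment: M A L K stop
optimised = recode_for_host(original)
```

The recoded sequence differs at the nucleotide level but translates to the same polypeptide, which is the property any such modification is expected to preserve.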

Transgenic Plants

The term “plant” refers to whole plants, plant organs (e.g. leaves, stems, roots, etc.), seeds, plant cells and the like. Plants contemplated for use in the practice of the present invention include both monocotyledons and dicotyledons. Target plants include, but are not limited to, the following: cereals (wheat, barley, rye, oats, rice, sorghum and related crops); beet (sugar beet and fodder beet); pomes, stone fruit and soft fruit (apples, pears, plums, peaches, almonds, cherries, strawberries, raspberries and blackberries); leguminous plants (beans, lentils, peas, soybeans); oil plants (rape, mustard, poppy, olives, sunflowers, coconut, castor oil plants, cocoa beans, groundnuts); cucumber plants (marrows, cucumbers, melons); fibre plants (cotton, flax, hemp, jute); citrus fruit (oranges, lemons, grapefruit, mandarins); vegetables (spinach, lettuce, asparagus, cabbages, carrots, onions, tomatoes, potatoes, paprika); lauraceae (avocados, cinnamon, camphor); or plants such as maize, tobacco, nuts, coffee, sugar cane, tea, vines, hops, turf, bananas and natural rubber plants, as well as ornamentals (flowers, shrubs, broad-leaved trees and evergreens, such as conifers).

Transgenic plants, as defined in the context of the present invention, include plants (as well as parts and cells of said plants) and their progeny which have been genetically modified using recombinant techniques to cause production of at least one polypeptide of the present invention in the desired plant or plant organ. Transgenic plants can be produced using techniques known in the art, such as those generally described in A. Slater et al., Plant Biotechnology: The Genetic Manipulation of Plants, Oxford University Press (2003), and P. Christou and H. Klee, Handbook of Plant Biotechnology, John Wiley and Sons (2004).

A polynucleotide of the present invention may be expressed constitutively in the transgenic plants during all stages of development. Depending on the use of the plant or plant organs, the polypeptides may be expressed in a stage-specific manner. Furthermore, the polynucleotides may be expressed tissue-specifically.

Regulatory sequences which are known or are found to cause expression of a gene encoding a polypeptide of interest in plants may be used in the present invention. The choice of the regulatory sequences used depends on the target plant and/or target organ of interest. Such regulatory sequences may be obtained from plants or plant viruses, or may be chemically synthesized. Such regulatory sequences are well known to those skilled in the art.

Constitutive plant promoters are well known. In addition to the promoters mentioned previously, other suitable promoters include, but are not limited to, the nopaline synthase promoter, the octopine synthase promoter, the CaMV 35S promoter, the ribulose-1,5-bisphosphate carboxylase promoter, Adh1-based pEmu, Act1, the SAM synthase promoter and Ubi promoters and the promoter of the chlorophyll a/b binding protein. Alternatively, it may be desired to have the transgene(s) expressed in a regulated fashion. Regulated expression of the polypeptides is possible by placing the coding sequence of the silk protein under the control of promoters that are tissue-specific, developmental-specific, or inducible. Several tissue-specific regulated genes and/or promoters have been reported in plants. These include genes encoding the seed storage proteins (such as napin, cruciferin, β-conglycinin, glycinin and phaseolin), zein or oil body proteins (such as oleosin), or genes involved in fatty acid biosynthesis (including acyl carrier protein, stearoyl-ACP desaturase, and fatty acid desaturases (fad 2-1)), and other genes expressed during embryo development (such as Bce4). Particularly useful for seed-specific expression is the pea vicilin promoter. Other useful promoters for expression in mature leaves are those that are switched on at the onset of senescence, such as the SAG promoter from Arabidopsis. A class of fruit-specific promoters expressed at or during anthesis through fruit development, at least until the beginning of ripening, is discussed in U.S. Pat. No. 4,943,674. Other examples of tissue-specific promoters include those that direct expression in tubers (for example, patatin gene promoter), and in fiber cells (an example of a developmentally-regulated fiber cell protein is E6 fiber).

Other regulatory sequences such as terminator sequences and polyadenylation signals include any such sequence functioning as such in plants, the choice of which would be obvious to the skilled addressee. The termination region used in the expression cassette will be chosen primarily for convenience, since the termination regions appear to be relatively interchangeable. The termination region which is used may be native with the transcriptional initiation region, may be native with the polynucleotide sequence of interest, or may be derived from another source. The termination region may be naturally occurring, or wholly or partially synthetic. Convenient termination regions are available from the Ti-plasmid of A. tumefaciens, such as the octopine synthase and nopaline synthase termination regions, or from the genes for β-phaseolin or the chemically inducible plant gene, pIN.

Several techniques are available for the introduction of an expression construct containing a nucleic acid sequence encoding a polypeptide of interest into the target plants. Such techniques include but are not limited to transformation of protoplasts using the calcium/polyethylene glycol method, electroporation and microinjection or (coated) particle bombardment. In addition to these so-called direct DNA transformation methods, transformation systems involving vectors are widely available, such as viral and bacterial vectors (e.g. from the genus Agrobacterium). After selection and/or screening, the protoplasts, cells or plant parts that have been transformed can be regenerated into whole plants, using methods known in the art. The choice of the transformation and/or regeneration techniques is not critical for this invention.

To confirm the presence of the transgenes in transgenic cells and plants, a polymerase chain reaction (PCR) amplification or Southern blot analysis can be performed using methods known to those skilled in the art. Expression products of the transgenes can be detected in any of a variety of ways, depending upon the nature of the product, and include Western blot and enzyme assay. One particularly useful way to quantitate protein expression and to detect replication in different plant tissues is to use a reporter gene, such as GUS. Once transgenic plants have been obtained, they may be grown to produce plant tissues or parts having the desired phenotype. The plant tissue or plant parts may be harvested, and/or the seed collected. The seed may serve as a source for growing additional plants with tissues or parts having the desired characteristics.

Transgenic Non-Human Animals

Techniques for producing transgenic animals are well known in the art. A useful general textbook on this subject is Houdebine, Transgenic animals: Generation and Use (Harwood Academic, 1997).

Heterologous DNA can be introduced, for example, into fertilized mammalian ova. For instance, totipotent or pluripotent stem cells can be transformed by microinjection, calcium phosphate mediated precipitation, liposome fusion, retroviral infection or other means; the transformed cells are then introduced into the embryo, and the embryo then develops into a transgenic animal. In a highly preferred method, developing embryos are infected with a retrovirus containing the desired DNA, and transgenic animals are produced from the infected embryo. In a most preferred method, however, the appropriate DNAs are coinjected into the pronucleus or cytoplasm of embryos, preferably at the single cell stage, and the embryos are allowed to develop into mature transgenic animals.

Another method used to produce a transgenic animal involves microinjecting a nucleic acid into pro-nuclear stage eggs by standard methods. Injected eggs are then cultured before transfer into the oviducts of pseudopregnant recipients.

Transgenic animals may also be produced by nuclear transfer technology. Using this method, fibroblasts from donor animals are stably transfected with a plasmid incorporating the coding sequences for a binding domain or binding partner of interest under the control of regulatory sequences. Stable transfectants are then fused to enucleated oocytes, cultured and transferred into female recipients.

Recovery Methods and Production of Silk

The silk proteins of the present invention may be extracted and purified from recombinant cells, such as plant, bacteria or yeast cells, producing said protein by a variety of methods. In one embodiment, the method involves removal of native cell proteins from homogenized cells/tissues/plants etc. by lowering pH and heating, followed by ammonium sulfate fractionation. Briefly, total soluble proteins are extracted by homogenizing cells/tissues/plants. Native proteins are removed by precipitation at pH 4.7 and then at 60° C. The resulting supernatant is then fractionated with ammonium sulfate at 40% saturation. The resulting protein will be of the order of 95% pure. Additional purification may be achieved with conventional gel or affinity chromatography.

In another example, cell lysates are treated with high concentrations of acid, e.g. HCl or propionic acid, to reduce the pH to about 1-2 for 1 hour or more, which will solubilise the silk proteins but precipitate other proteins.

Fibrillar aggregates will form from solutions by spontaneous self-assembly of silk proteins of the invention when the protein concentration exceeds a critical value. The aggregates may be gathered and mechanically spun into macroscopic fibers according to the method of O'Brien et al. (I. O'Brien et al., “Design, Synthesis and Fabrication of Novel Self-Assembling Fibrillar Proteins”, in Silk Polymers: Materials Science and Biotechnology, pp. 104-117, Kaplan, Adams, Farmer and Viney, eds., c. 1994 by American Chemical Society, Washington, D.C.).

By nature of the inherent coiled coil secondary structure, proteins such as Xenospira1-4, BBF1-4, BAF1-4 and GAF1-4 will spontaneously form the coiled coil secondary structure upon dehydration. As described below, the strength of the coiled coil can be enhanced through enzymatic or chemical cross-linking of lysine residues in close proximity.

Silk fibres and/or copolymers of the invention have a low processing requirement. The silk proteins of the invention require minimal processing, e.g. spinning, to form a strong fibre, as they spontaneously form strong coiled coils which can be reinforced with crosslinks such as lysine crosslinks. This contrasts with B. mori and spider recombinant silk polypeptides, which require sophisticated spinning techniques in order to obtain the secondary structure (β-sheet) and strength of the fibre.

However, fibers may be spun from solutions having properties characteristic of a liquid crystal phase. The fiber concentration at which phase transition can occur is dependent on the composition of a protein or combination of proteins present in the solution. Phase transition, however, can be detected by monitoring the clarity and birefringence of the solution. Onset of a liquid crystal phase can be detected when the solution acquires a translucent appearance and registers birefringence when viewed through crossed polarizing filters.

In one fiber-forming technique, fibers can first be extruded from the protein solution through an orifice into methanol, until a length sufficient to be picked up by a mechanical means is produced. Then a fiber can be pulled by such mechanical means through a methanol solution, collected, and dried. Methods for drawing fibers are considered well-known in the art.

Further examples of methods which may be used for producing silk fibres and/or copolymers of the present invention are described in US 2004/0170827 and US 2005/0054830.

In a preferred embodiment, silk fibres and/or copolymers of the invention are crosslinked. In one embodiment, the silk fibres and/or copolymers are crosslinked to a surface/article/product etc. of interest using techniques known in the art. In another embodiment (or in combination with the previous embodiment), at least some silk proteins in the silk fibres and/or copolymers are crosslinked to each other. Preferably, the silk proteins are crosslinked via lysine residues in the proteins. Such crosslinking can be performed using chemical and/or enzymatic techniques known in the art. For example, enzymatic crosslinks can be catalysed by lysyl oxidase, whereas nonenzymatic crosslinks can be generated from glycated lysine residues (Reiser et al., 1992).

Antibodies

The invention also provides monoclonal or polyclonal antibodies to polypeptides of the invention or fragments thereof. Thus, the present invention further provides a process for the production of monoclonal or polyclonal antibodies to polypeptides of the invention.

The term “binds specifically” refers to the ability of the antibody to bind to at least one polypeptide of the present invention but not other known silk proteins.

As used herein, the term “epitope” refers to a region of a polypeptide of the invention which is bound by the antibody. An epitope can be administered to an animal to generate antibodies against the epitope; however, antibodies of the present invention preferably specifically bind the epitope region in the context of the entire polypeptide.

If polyclonal antibodies are desired, a selected mammal (e.g., mouse, rabbit, goat, horse, etc.) is immunised with an immunogenic polypeptide of the invention. Serum from the immunised animal is collected and treated according to known procedures. If serum containing polyclonal antibodies contains antibodies to other antigens, the polyclonal antibodies can be purified by immunoaffinity chromatography. Techniques for producing and processing polyclonal antisera are known in the art. In order that such antibodies may be made, the invention also provides polypeptides of the invention or fragments thereof haptenised to another polypeptide for use as immunogens in animals.

Monoclonal antibodies directed against polypeptides of the invention can also be readily produced by one skilled in the art. The general methodology for making monoclonal antibodies by hybridomas is well known. Immortal antibody-producing cell lines can be created by cell fusion, and also by other techniques such as direct transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. Panels of monoclonal antibodies produced can be screened for various properties; i.e., for isotype and epitope affinity.

An alternative technique involves screening phage display libraries where, for example, the phage express scFv fragments on the surface of their coat with a large variety of complementarity determining regions (CDRs). This technique is well known in the art.

For the purposes of this invention, the term “antibody”, unless specified to the contrary, includes fragments of whole antibodies which retain their binding activity for a target antigen. Such fragments include Fv, F(ab′) and F(ab′)₂ fragments, as well as single chain antibodies (scFv). Furthermore, the antibodies and fragments thereof may be humanised antibodies, for example as described in EP-A-239400.

Antibodies of the invention may be bound to a solid support and/or packaged into kits in a suitable container along with suitable reagents, controls, instructions and the like.

Preferably, antibodies of the present invention are detectably labeled. Exemplary detectable labels that allow for direct measurement of antibody binding include radiolabels, fluorophores, dyes, magnetic beads, chemiluminescers, colloidal particles, and the like. Examples of labels which permit indirect measurement of binding include enzymes where the substrate may provide for a coloured or fluorescent product. Additional exemplary detectable labels include covalently bound enzymes capable of providing a detectable product signal after addition of suitable substrate. Examples of suitable enzymes for use in conjugates include horseradish peroxidase, alkaline phosphatase, malate dehydrogenase and the like. Where not commercially available, such antibody-enzyme conjugates are readily produced by techniques known to those skilled in the art. Further exemplary detectable labels include biotin, which binds with high affinity to avidin or streptavidin; fluorochromes (e.g., phycobiliproteins, phycoerythrin and allophycocyanins; fluorescein and Texas red), which can be used with a fluorescence activated cell sorter; haptens; and the like. Preferably, the detectable label allows for direct measurement in a plate luminometer, e.g., biotin. Such labeled antibodies can be used in techniques known in the art to detect polypeptides of the invention.

Compositions

Compositions of the present invention may include an “acceptable carrier”. Examples of such acceptable carriers include water, saline, Ringer's solution, dextrose solution, Hank's solution, and other aqueous physiologically balanced salt solutions. Nonaqueous vehicles, such as fixed oils, sesame oil, ethyl oleate, or triglycerides, may also be used.

In one embodiment, the “acceptable carrier” is a “pharmaceutically acceptable carrier”. The term pharmaceutically acceptable carrier refers to molecular entities and compositions that do not produce an allergic, toxic or otherwise adverse reaction when administered to an animal, particularly a mammal, and more particularly a human. Useful examples of pharmaceutically acceptable carriers or diluents include, but are not limited to, solvents, dispersion media, coatings, stabilizers, protective colloids, adhesives, thickeners, thixotropic agents, penetration agents, sequestering agents and isotonic and absorption delaying agents that do not affect the activity of the polypeptides of the invention. The proper fluidity can be maintained, for example, by the use of a coating, such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. More generally, the polypeptides of the invention can be combined with any non-toxic solid or liquid additive corresponding to the usual formulating techniques.

As outlined herein, in some embodiments a polypeptide, a silk fiber and/or a copolymer of the invention is used as a pharmaceutically acceptable carrier.

Other suitable compositions are described below with specific reference to specific uses of the polypeptides of the invention.

Uses

Silk proteins are useful for the creation of new biomaterials because of their exceptional toughness and strength. However, the fibrous proteins of spiders and insects characterised to date are large proteins (over 100 kDa) and consist of highly repetitive amino acid sequences. These proteins are encoded by large genes containing highly biased codons, making them particularly difficult to produce in recombinant systems. By comparison, the silk proteins of the invention are short and non-repetitive. These properties make the genes encoding these proteins particularly attractive for recombinant production of new biomaterials.
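As a rough illustration of what “repetitive” can mean in practice, the sketch below scores a sequence by the fraction of its k-mer windows that occur more than once. This metric, the function name `repeated_kmer_fraction` and the two toy strings are assumptions made for illustration; they are not taken from this specification and are not sequences of the invention.

```python
from collections import Counter


def repeated_kmer_fraction(seq, k=4):
    """Fraction of k-mer windows in seq whose k-mer occurs more than once."""
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        return 0.0
    counts = Counter(seq[i:i + k] for i in range(n_windows))
    # Count every window that belongs to a k-mer seen at least twice.
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / n_windows


# Toy inputs (hypothetical): a repeat-rich string and a string with no repeats.
repeat_rich = "GAGAGAGAGA"
non_repetitive = "ACDEFGHIKL"

score_repeat_rich = repeated_kmer_fraction(repeat_rich)
score_non_repetitive = repeated_kmer_fraction(non_repetitive)
```

A score near 1 indicates that most of the sequence is built from recurring motifs, while a score near 0 indicates little internal repetition at the chosen window size. The same function can be applied to nucleotide sequences, where repeats are associated with the recombination problems noted for large fibroin genes.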

The silk proteins, silk fibers and/or copolymers of the invention can be used for a broad and diverse array of medical, military, industrial and commercial applications. The fibers can be used in the manufacture of medical devices such as sutures, skin grafts, cellular growth matrices, replacement ligaments, and surgical mesh, and in a wide range of industrial and commercial products, such as, for example, cable, rope, netting, fishing line, clothing fabric, bullet-proof vest lining, container fabric, backpacks, knapsacks, bag or purse straps, adhesive binding material, non-adhesive binding material, strapping material, tent fabric, tarpaulins, pool covers, vehicle covers, fencing material, sealant, construction material, weatherproofing material, flexible partition material, sports equipment; and, in fact, in nearly any use of fiber or fabric for which high tensile strength and elasticity are desired characteristics. The silk proteins, silk fibers and/or copolymers of the present invention also have applications in compositions for personal care products such as cosmetics, skin care, hair care and hair colouring; and in coating of particles, such as pigments.

The silk proteins may be used in their native form or they may be modified to form derivatives, which provide a more beneficial effect. For example, the silk protein may be modified by conjugation to a polymer to reduce allergenicity as described in U.S. Pat. No. 5,981,718 and U.S. Pat. No. 5,856,451. Suitable modifying polymers include, but are not limited to, polyalkylene oxides, polyvinyl alcohol, poly-carboxylates, poly(vinylpyrrolidone), and dextrans. In another example, the silk proteins may be modified by selective digestion and splicing of other protein modifiers. For example, the silk proteins may be cleaved into smaller peptide units by treatment with acid at an elevated temperature of about 60° C. The useful acids include, but are not limited to, dilute hydrochloric, sulfuric or phosphoric acids. Alternatively, digestion of the silk proteins may be done by treatment with a base, such as sodium hydroxide, or by enzymatic digestion using a suitable protease.

The proteins may be further modified to provide performance characteristics that are beneficial in specific applications for personal care products. The modification of proteins for use in personal care products is well known in the art. For example, commonly used methods are described in U.S. Pat. No. 6,303,752, U.S. Pat. No. 6,284,246, and U.S. Pat. No. 6,358,501. Examples of modifications include, but are not limited to, ethoxylation to promote water-oil emulsion enhancement, siloxylation to provide lipophilic compatibility, and esterification to aid in compatibility with soap and detergent compositions. Additionally, the silk proteins may be derivatized with functional groups including, but not limited to, amines, oxiranes, cyanates, carboxylic acid esters, silicone copolyols, siloxane esters, quaternized amine aliphatics, urethanes, polyacrylamides, dicarboxylic acid esters, and halogenated esters. The silk proteins may also be derivatized by reaction with diimines and by the formation of metal salts.

Consistent with the above definitions of “polypeptide” (and “protein”), such derivatized and/or modified molecules are also referred to herein broadly as “polypeptides” and “proteins”.

Silk proteins of the invention can be spun together and/or bundled or braided with other fiber types. Examples include, but are not limited to, polymeric fibers (e.g., polypropylene, nylon, polyester), fibers and silks of other plant and animal sources (e.g., cotton, wool, Bombyx mori or spider silk), and glass fibers. A preferred embodiment is silk fiber braided with 10% polypropylene fiber. The present invention contemplates that the production of such combinations of fibers can be readily practiced to enhance any desired characteristics, e.g., appearance, softness, weight, durability, water-repellent properties, improved cost-of-manufacture, that may be generally sought in the manufacture and production of fibers for medical, industrial, or commercial applications.

Personal Care Products

Cosmetic and skin care compositions may be anhydrous compositions comprising an effective amount of silk protein in a cosmetically acceptable medium. The uses of these compositions include, but are not limited to, skin care, skin cleansing, make-up, and anti-wrinkle products. An effective amount of a silk protein for cosmetic and skin care compositions is herein defined as a proportion of from about 10⁻⁴ to about 30% by weight, but preferably from about 10⁻³ to 15% by weight, relative to the total weight of the composition. This proportion may vary as a function of the type of cosmetic or skin care composition. Suitable compositions for a cosmetically acceptable medium are described in U.S. Pat. No. 6,280,747. For example, the cosmetically acceptable medium may contain a fatty substance in a proportion generally of from about 10 to about 90% by weight relative to the total weight of the composition, with the fatty phase containing at least one liquid, solid or semi-solid fatty substance. The fatty substance includes, but is not limited to, oils, waxes, gums, and so-called pasty fatty substances. Alternatively, the compositions may be in the form of a stable dispersion such as a water-in-oil or oil-in-water emulsion. Additionally, the compositions may contain one or more conventional cosmetic or dermatological additives or adjuvants, including but not limited to, antioxidants, preserving agents, fillers, surfactants, UVA and/or UVB sunscreens, fragrances, thickeners, wetting agents and anionic, nonionic or amphoteric polymers, and dyes or pigments.

Emulsified cosmetics and quasi drugs which are producible with the use of emulsified materials comprising at least one silk protein of the present invention include, for example, cleansing cosmetics (beauty soap, facial wash, shampoo, rinse, and the like), hair care products (hair dye, hair cosmetics, and the like), basic cosmetics (general cream, emulsion, shaving cream, conditioner, cologne, shaving lotion, cosmetic oil, facial mask, and the like), make-up cosmetics (foundation, eyebrow pencil, eye cream, eye shadow, mascara, and the like), aromatic cosmetics (perfume and the like), tanning and sunscreen cosmetics (tanning and sunscreen cream, tanning and sunscreen lotion, tanning and sunscreen oil, and the like), nail cosmetics (nail cream and the like), eyeliner cosmetics (eyeliner and the like), lip cosmetics (lipstick, lip cream, and the like), oral care products (tooth paste and the like), bath cosmetics (bath products and the like), and the like.

The cosmetic composition may also be in the form of products for nailcare, such as a nail varnish. Nail varnishes are herein defined ascompositions for the treatment and colouring of nails, comprising aneffective amount of silk protein in a cosmetically acceptable medium. Aneffective amount of a silk protein for use in a nail varnish compositionis herein defined as a proportion of from about 10⁻⁴ to about 30% byweight relative to the total weight of the varnish. Components of acosmetically acceptable medium for nail varnishes are described in U.S.Pat. No. 6,280,747. The nail varnish typically contains a solvent and afilm forming substance, such as cellulose derivatives, polyvinylderivatives, acrylic polymers or copolymers, vinyl copolymers andpolyester polymers. The composition may also contain an organic orinorganic pigment.

Hair care compositions are herein defined as compositions for the treatment of hair, including but not limited to shampoos, conditioners, lotions, aerosols, gels, and mousses, comprising an effective amount of silk protein in a cosmetically acceptable medium. An effective amount of a silk protein for use in a hair care composition is herein defined as a proportion of from about 10⁻² to about 90% by weight relative to the total weight of the composition. Components of a cosmetically acceptable medium for hair care compositions are described in US 2004/0170590, U.S. Pat. No. 6,280,747, U.S. Pat. No. 6,139,851, and U.S. Pat. No. 6,013,250. For example, these hair care compositions can be aqueous, alcoholic or aqueous-alcoholic solutions, the alcohol preferably being ethanol or isopropanol, in a proportion of from about 1 to about 75% by weight relative to the total weight, for the aqueous-alcoholic solutions. Additionally, the hair care compositions may contain one or more conventional cosmetic or dermatological additives or adjuvants, as given above.

Hair colouring compositions are herein defined as compositions for the colouring, dyeing, or bleaching of hair, comprising an effective amount of silk protein in a cosmetically acceptable medium. An effective amount of a silk protein for use in a hair colouring composition is herein defined as a proportion of from about 10⁻⁴ to about 60% by weight relative to the total weight of the composition. Components of a cosmetically acceptable medium for hair colouring compositions are described in US 2004/0170590, U.S. Pat. No. 6,398,821 and U.S. Pat. No. 6,129,770. For example, hair colouring compositions generally contain a mixture of an inorganic peroxygen-based dye oxidizing agent and an oxidizable coloring agent. The peroxygen-based dye oxidizing agent is most commonly hydrogen peroxide. The oxidative hair coloring agents are formed by oxidative coupling of primary intermediates (for example p-phenylenediamines, p-aminophenols, p-diaminopyridines, hydroxyindoles, aminoindoles, aminothymidines, or cyanophenols) with secondary intermediates (for example phenols, resorcinols, m-aminophenols, m-phenylenediamines, naphthols, pyrazolones, hydroxyindoles, catechols or pyrazoles). Additionally, hair colouring compositions may contain oxidizing acids, sequestrants, stabilizers, thickeners, buffers, carriers, surfactants, solvents, antioxidants, polymers, non-oxidative dyes and conditioners.

The silk proteins can also be used to coat pigments and cosmetic particles in order to improve dispersibility of the particles for use in cosmetics and coating compositions. Cosmetic particles are herein defined as particulate materials such as pigments or inert particles that are used in cosmetic compositions. Suitable pigments and cosmetic particles include, but are not limited to, inorganic color pigments, organic pigments, and inert particles. The inorganic color pigments include, but are not limited to, titanium dioxide, zinc oxide, and oxides of iron, magnesium, cobalt, and aluminium. Organic pigments include, but are not limited to, D&C Red No. 36, D&C Orange No. 17, the calcium lakes of D&C Red Nos. 7, 11, 31 and 34, the barium lake of D&C Red No. 12, the strontium lake of D&C Red No. 13, the aluminium lake of FD&C Yellow No. 5 and carbon black particles. Inert particles include, but are not limited to, calcium carbonate, aluminium silicate, calcium silicate, magnesium silicate, mica, talc, barium sulfate, calcium sulfate, powdered Nylon™, perfluorinated alkanes, and other inert plastics.

The silk proteins may also be used in dental floss (see, for example, US 2005/0161058). The floss may be monofilament yarn or multifilament yarn, and the fibers may or may not be twisted. The dental floss may be packaged as individual pieces or in a roll with a cutter for cutting pieces to any desired length. The dental floss may be provided in a variety of shapes other than filaments, such as but not limited to, strips and sheets and the like. The floss may be coated with different materials, such as but not limited to, wax, polytetrafluoroethylene monofilament yarn for floss.

The silk proteins may also be used in soap (see, for example, US 2005/0130857).

Pigment and Cosmetic Particle Coating

The effective amount of a silk protein for use in pigment and cosmetic particle coating is herein defined as a proportion of from about 10⁻⁴ to about 50%, but preferably from about 0.25 to about 15% by weight relative to the dry weight of particle. The optimum amount of the silk protein to be used depends on the type of pigment or cosmetic particle being coated. For example, the amount of silk protein used with inorganic color pigments is preferably between about 0.01% and 20% by weight. In the case of organic pigments, the preferred amount of silk protein is between about 1% and about 15% by weight, while for inert particles, the preferred amount is between about 0.25% and about 3% by weight. Methods for the preparation of coated pigments and particles are described in U.S. Pat. No. 5,643,672. These methods include: adding an aqueous solution of the silk protein to the particles while tumbling or mixing, forming a slurry of the silk protein and the particles and drying, spray drying a solution of the silk protein onto the particles, or lyophilizing a slurry of the silk protein and the particles. These coated pigments and cosmetic particles may be used in cosmetic formulations, paints, inks and the like.
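As a rough illustration of how the preferred loadings above translate into quantities, the following sketch converts the stated percentage ranges into grams of silk protein for a given dry weight of particles. The dictionary keys and the function name are illustrative only and do not appear in the cited references.

```python
# Preferred silk protein loadings, in percent of the dry particle weight,
# taken from the ranges stated in the paragraph above.
PREFERRED_RANGE_PCT = {
    "inorganic": (0.01, 20.0),  # inorganic color pigments
    "organic": (1.0, 15.0),     # organic pigments
    "inert": (0.25, 3.0),       # inert particles
}


def silk_mass_range_g(particle_type, dry_particle_mass_g):
    """Return (minimum, maximum) grams of silk protein for a particle batch."""
    low_pct, high_pct = PREFERRED_RANGE_PCT[particle_type]
    return (
        dry_particle_mass_g * low_pct / 100.0,
        dry_particle_mass_g * high_pct / 100.0,
    )


if __name__ == "__main__":
    for kind in PREFERRED_RANGE_PCT:
        low, high = silk_mass_range_g(kind, 100.0)
        print(f"{kind}: {low:g} g to {high:g} g per 100 g of particles")
```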

Biomedical

The silk proteins may be used as a coating on a bandage to promote wound healing. For this application, the bandage material is coated with an effective amount of the silk protein. For the purpose of a wound-healing bandage, an effective amount of silk protein is herein defined as a proportion of from about 10⁻⁴ to about 30% by weight relative to the weight of the bandage material. The material to be coated may be any soft, biologically inert, porous cloth or fiber. Examples include, but are not limited to, cotton, silk, rayon, acetate, acrylic, polyethylene, polyester, and combinations thereof. The coating of the cloth or fiber may be accomplished by a number of methods known in the art. For example, the material to be coated may be dipped into an aqueous solution containing the silk protein. Alternatively, the solution containing the silk protein may be sprayed onto the surface of the material to be coated using a spray gun. Additionally, the solution containing the silk protein may be coated onto the surface using a roller coat printing process. The wound bandage may include other additives including, but not limited to, disinfectants such as iodine, potassium iodide, povidone iodine, acrinol, hydrogen peroxide, benzalkonium chloride, and chlorhexidine; cure accelerating agents such as allantoin, dibucaine hydrochloride, and chlorophenylamine malate; vasoconstrictor agents such as naphazoline hydrochloride; astringent agents such as zinc oxide; and crust generating agents such as boric acid.

The silk proteins of the present invention may also be used in the form of a film as a wound dressing material. The use of silk proteins, in the form of an amorphous film, as a wound dressing material is described in U.S. Pat. No. 6,175,053. The amorphous film comprises a dense and nonporous film of a crystallinity below 10% which contains an effective amount of silk protein. For a film for wound care, an effective amount of silk protein is herein defined as between about 1 and 99% by weight. The film may also contain other components including but not limited to other proteins such as sericin, and disinfectants, cure accelerating agents, vasoconstrictor agents, astringent agents, and crust generating agents, as described above. Other proteins such as sericin may comprise 1 to 99% by weight of the composition. The amount of the other ingredients listed is preferably below a total of about 30% by weight, more preferably between about 0.5 and 20% by weight of the composition. The wound dressing film may be prepared by dissolving the above mentioned materials in an aqueous solution, removing insolubles by filtration or centrifugation, and casting the solution on a smooth solid surface such as an acrylic plate, followed by drying.

The silk proteins of the present invention may also be used in sutures (see, for example, US 2005/0055051). Such sutures can feature a braided jacket made of ultrahigh molecular weight polyethylene fibers and silk fibers. The polyethylene provides strength. Polyester fibers may be woven with the high molecular weight polyethylene to provide improved tie down properties. The silk may be provided in a contrasting color to provide a trace for improved suture recognition and identification. Silk also is more tissue compliant than other fibers, allowing the ends to be cut close to the knot without concern for deleterious interaction between the ends of the suture and surrounding tissue. Handling properties of the high strength suture also can be enhanced using various materials to coat the suture. The suture advantageously has the strength of Ethibond No. 5 suture, yet has the diameter, feel and tie-ability of No. 2 suture. As a result, the suture is ideal for most orthopedic procedures such as rotator cuff repair, Achilles tendon repair, patellar tendon repair, ACL/PCL reconstruction, hip and shoulder reconstruction procedures, and replacement for suture used in or with suture anchors. The suture can be uncoated, or coated with wax (beeswax, petroleum wax, polyethylene wax, or others), silicone (Dow Corning silicone fluid 202A or others), silicone rubbers, PBA (polybutylate acid), ethyl cellulose (Filodel) or other coatings, to improve lubricity of the braid, knot security, or abrasion resistance, for example.

The silk proteins of the present invention may also be used in stents (see, for example, US 2004/0199241). For example, a stent graft is provided that includes an endoluminal stent and a graft, wherein the stent graft includes silk. The silk induces a response in a host who receives the stent graft, where the response can lead to enhanced adhesion between the silk stent graft and the host's tissue that is adjacent to the silk of the silk stent graft. The silk may be attached to the graft by any of various means, e.g., by interweaving the silk into the graft or by adhering the silk to the graft (e.g., by means of an adhesive or by means of suture). The silk may be in the form of a thread, a braid, a sheet, powder, etc. As for the location of the silk on the stent graft, the silk may be attached only to the exterior of the stent, and/or the silk may be attached to distal regions of the stent graft, in order to assist in securing those distal regions to neighbouring tissue in the host. A wide variety of stent grafts may be utilized within the context of the present invention, depending on the site and nature of treatment desired. Stent grafts may be, for example, bifurcated or tube grafts, cylindrical or tapered, self-expandable or balloon-expandable, unibody or modular, etc.

In addition to silk, the stent graft may contain a coating on some or all of the silk, where the coating degrades upon insertion of the stent graft into a host, the coating thereby delaying contact between the silk and the host. Suitable coatings include, without limitation, gelatin, degradable polyesters (e.g., PLGA, PLA, MePEG-PLGA, PLGA-PEG-PLGA, and copolymers and blends thereof), cellulose and cellulose derivatives (e.g., hydroxypropyl cellulose), polysaccharides (e.g., hyaluronic acid, dextran, dextran sulfate, chitosan), lipids, fatty acids, sugar esters, nucleic acid esters, polyanhydrides, polyorthoesters and polyvinylalcohol (PVA). The silk-containing stent grafts may contain a biologically active agent (drug), where the agent is released from the stent graft and then induces an enhanced cellular response (e.g., cellular or extracellular matrix deposition) and/or fibrotic response in a host into which the stent graft has been inserted.

The silk proteins of the present invention may also be used in a matrix for producing ligaments and tendons ex vivo (see, for example, US 2005/0089552). A silk-fiber-based matrix can be seeded with pluripotent cells, such as bone marrow stromal cells (BMSCs). The bioengineered ligament or tendon is advantageously characterized by a cellular orientation and/or matrix crimp pattern in the direction of applied mechanical forces, and also by the production of ligament and tendon specific markers including collagen type I, collagen type III, and fibronectin proteins along the axis of mechanical load produced by the mechanical forces or stimulation, if such forces are applied. In a preferred embodiment, the ligament or tendon is characterized by the presence of fiber bundles which are arranged into a helical organization. Some examples of ligaments or tendons that can be produced include anterior cruciate ligament, posterior cruciate ligament, rotator cuff tendons, medial collateral ligament of the elbow and knee, flexor tendons of the hand, lateral ligaments of the ankle and tendons and ligaments of the jaw or temporomandibular joint. Other tissues that may be produced by methods of the present invention include cartilage (both articular and meniscal), bone, muscle, skin and blood vessels.

The silk proteins of the present invention may also be used in hydrogels (see, for example, US 2005/0266992). Silk fibroin hydrogels can be characterized by an open pore structure which allows their use as tissue engineering scaffolds, substrates for cell culture, wound and burn dressings, soft tissue substitutes, bone filler, and supports for pharmaceutical or biologically active compounds.

The silk proteins may also be used in dermatological compositions (see, for example, US 2005/0019297). Furthermore, the silk proteins of the invention and derivatives thereof may also be used in sustained release compositions (see, for example, US 2004/0005363).

Textiles

The silk proteins of the present invention may also be applied to the surface of fibers for subsequent use in textiles. This provides a monolayer of the protein film on the fiber, resulting in a smooth finish. U.S. Pat. No. 6,416,558 and U.S. Pat. No. 5,232,611 describe the addition of a finishing coat to fibers. The methods described in these disclosures provide examples of the versatility of finishing the fiber to provide a good feel and a smooth surface. For this application, the fiber is coated with an effective amount of the silk protein. For the purpose of fiber coating for use in textiles, an effective amount of silk protein is herein defined as a proportion of from about 1 to about 99% by weight relative to the weight of the fiber material. The fiber materials include, but are not limited to, textile fibers of cotton, polyesters such as rayon and Lycra™, nylon, wool, and other natural fibers including native silk. Compositions suitable for applying the silk protein onto the fiber may include co-solvents such as ethanol, isopropanol, hexafluoranols, isothiocyanouranates, and other polar solvents that can be mixed with water to form solutions or microemulsions. The silk protein-containing solution may be sprayed onto the fiber or the fiber may be dipped into the solution. While not necessary, flash drying of the coated material is preferred. An alternative protocol is to apply the silk protein composition onto woven fibers. An ideal embodiment of this application is the use of silk proteins to coat stretchable weaves such as used for stockings.

Composite Materials

Silk fibres can be added to polyurethane, other resins or thermoplastic fillers to prepare panel boards and other construction material or as moulded furniture and benchtops that replace wood and particle board. The composites can also be used in building and automotive construction, especially rooftops and door panels. The silk fibres reinforce the resin, making the material much stronger and allowing lighter weight construction which is of equal or superior strength to other particle boards and composite materials. Silk fibres may be isolated and added to a synthetic composite-forming resin or be used in combination with plant-derived proteins, starch and oils to produce biologically-based composite materials. Processes for the production of such materials are described in JP 2004284246, US 2005175825, U.S. Pat. No. 4,515,737, JP 47020312 and WO 2005/017004.

Paper Additives

The fibre properties of the silk of the invention can add strength and quality texture to paper. Silk papers are made by mottling silk threads in cotton pulp to prepare extra smooth handmade papers, which are used for gift wrapping, notebook covers and carry bags. Processes for production of paper products which can include silk proteins of the invention are generally described in JP 2000139755.

Advanced Materials

Silks of the invention have considerable toughness and stand out among other silks in maintaining these properties when wet (Hepburn et al., 1979).

Areas of substantial growth in the clothing textile industry are the technical and intelligent textiles. There is a rising demand for healthy, high value functional, environmentally friendly and personalized textile products. Fibers, such as those of the invention, that do not change properties when wet and in particular maintain their strength and extensibility are useful for functional clothing for sports and leisure wear as well as work wear and protective clothing.

Developments in weapons and surveillance technologies are prompting innovations in individual protection equipment and battle-field related systems and structures. Besides meeting conventional requirements such as material durability under prolonged exposure, heavy wear and protection from the external environment, silk textiles of the invention can be processed to resist ballistic projectiles, fire and chemicals. Processes for the production of such materials are described in WO 2005/045122 and US 2005268443.

EXAMPLES

Example 1 Preparation and Analysis of Late Last Instar Salivary Gland cDNAs

The proteins that are found in euaculeatan and neuropteran (Apis mellifera, Bombus terrestris, Myrmecia forficata, Oecophylla smaragdina, Mallada signata) silks were identified by matching ion trap consecutive mass spectral (MS/MS) fragmentation patterns of peptides obtained by trypsin digestion of the silk with the predicted mass spectral data of proteins encoded by cDNAs isolated from the salivary gland of late final instar larvae. For confirmation that no proteins were missed by this analysis for the honeybee, the peptide mass spectral data were also compared to virtual tryptic digests of Apis mellifera proteins predicted by the bee genome project and translations of the Amel3 honeybee genomic sequences in all six reading frames.
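The idea of a virtual tryptic digest can be sketched as follows. This is a minimal illustration, not the software used in the Examples: it assumes the common convention that trypsin cleaves after lysine (K) or arginine (R) except when the next residue is proline, allows no missed cleavages, and uses standard monoisotopic residue masses. The example sequence is hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(protein):
    """Split a protein after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


if __name__ == "__main__":
    hypothetical = "AKRPGKLL"  # illustrative sequence only
    for pep in tryptic_digest(hypothetical):
        print(pep, round(peptide_mass(pep), 4))
```

The predicted peptide masses, and the fragment masses derived from them, are what a search program compares against the observed spectra.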

Honeybee

Apis mellifera larvae were obtained from domestic hives. Previously it was shown that silk production in Apis mellifera is confined to the salivary gland during the latter half of the final instar (Silva-Zacarin et al., 2003). During this period, RNA is more abundant in the posterior end of the gland (Flower and Kenchington, 1967). The cubical cell regions of 50 salivary glands were dissected from late fifth instar Apis mellifera immersed in phosphate buffered saline. The posterior end of the dissected gland was immediately placed into RNAlater® (Ambion, Austin, Tex., USA), to stabilise the mRNA, and subsequently stored at 4° C.

Total RNA (35 μg) was isolated from the late final instar salivary glands using the RNAqueous for PCR kit from Ambion (Austin, Tex., USA). Messenger RNA was isolated from the total RNA using the Micro-FastTrack™ 2.0 mRNA Isolation kit from Invitrogen (Carlsbad, Calif., USA) according to the manufacturer's directions, with the isolated mRNA being eluted into 10 μl RNAse free water.

A cDNA library was constructed from the mRNA isolated from Apis mellifera larvae using the CloneMiner™ cDNA library construction kit of Invitrogen (Carlsbad, Calif., USA) with the following modifications from the standard protocol: For the first strand synthesis, 0.5 μl of Biotin-attB2-Oligo(dT) primer at 6 pmol·μl⁻¹ and 0.5 μl of dNTPs at 2 mM each were added to the 10 μl mRNA. After incubation at 65° C. for 5 min then 45° C. for 2 min, 2 μl 5× First strand buffer, 1 μl of 0.1M DTT, and 0.5 μl SuperScript™ II RT at 200 U·μl⁻¹ were added. For second strand synthesis, the total volume of all reagents was halved and after ethanol precipitation, the cDNA was resuspended in 5 μl of DEPC-treated water. The attB1 adapter (1 μl) was ligated in a total volume of 10 μl to the 5 μl cDNA with 2 μl 5× Adapter buffer, 1 μl 0.1M DTT and 1 μl T4 DNA ligase (1 U·μl⁻¹) at 16° C. for 48 hrs with an additional 0.5 μl T4 DNA ligase (1 U·μl⁻¹) added after 16 hrs. The cDNA was size fractionated according to the manufacturer's instructions with samples eluting between 300-500 μl being precipitated with ethanol, resuspended and transformed into the provided E. coli DH10B™ T1 phage resistant cells as recommended. The cDNA library comprised approximately 1,200,000 colony forming units (cfu) with approximately 1% the original vector. The average insert size was 1.3±1.4 kbp.
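The first strand quantities above are given as volumes at stated concentrations. A short sketch of the conversion to absolute amounts (volume multiplied by concentration, using the identity 1 mM = 1 nmol·μl⁻¹) is given below; the function and key names are illustrative only.

```python
def amount(volume_ul, concentration_per_ul):
    """Amount of reagent = volume (ul) x concentration (units per ul)."""
    return volume_ul * concentration_per_ul


def millimolar_to_nmol_per_ul(conc_mM):
    """1 mM = 1 mmol per litre = 1 nmol per microlitre."""
    return conc_mM


# Volumes and concentrations as stated for the honeybee first strand synthesis.
first_strand = {
    "primer_pmol": amount(0.5, 6.0),                                 # 6 pmol/ul
    "each_dNTP_nmol": amount(0.5, millimolar_to_nmol_per_ul(2.0)),   # 2 mM each
    "DTT_nmol": amount(1.0, millimolar_to_nmol_per_ul(100.0)),       # 0.1 M
    "reverse_transcriptase_U": amount(0.5, 200.0),                   # 200 U/ul
}

if __name__ == "__main__":
    for name, value in first_strand.items():
        print(f"{name}: {value:g}")
```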

Eighty two clones were randomly selected and sequenced using the GenomeLab™ DTCS Quick start kit (Beckman Coulter, Fullerton, Calif., USA) and run on a CEQ8000 Biorad sequencer. These clustered into fifty four groups (Table 2). Identification of the cDNAs that encoded the silk proteins is described below.

Other Species

Total RNA was isolated from 4 bumblebee (Bombus terrestris) (2 μg RNA), 4 bulldog ant (Myrmecia forficata) (3 μg RNA), approximately 100 weaver ant (Oecophylla smaragdina) (0.4 μg RNA) and approximately 50 green lacewing (Mallada signata) late larval labial glands using the RNAqueous for PCR kit from Ambion (Austin, Tex., USA). mRNA was isolated from the total RNA using the Micro-FastTrack™ 2.0 mRNA Isolation kit from Invitrogen (Carlsbad, Calif., USA) into a final volume of 10 μl water. cDNA libraries were constructed from the mRNA using the CloneMiner™ cDNA kit of Invitrogen (Carlsbad, Calif., USA) with the following modifications from the standard protocol: For the first strand synthesis, 3 pmol of Biotin-attB2-Oligo(dT) primer and 1 nmol each dNTPs were added to the 10 μl mRNA. After 5 min at 65° C. followed by 2 min at 45° C., 2 μl 5× First strand buffer, 50 nmol DTT, and 100 U SuperScript™ II RT were added.

TABLE 2 A. mellifera final instar salivary gland cDNAs and MS ion trap fragmentation patterns of peptides from trypsin digestion of SDS treated brood comb silk.

Number of  Abundance in    Protein or gene         Number of         Distinct summed  Coverage of        Protein
cDNAs in   salivary gland  synonyms                tryptic peptides  MS/MS search     protein sequence   identification
cluster    library (%)                             identified in     score            (% protein)
                                                   the silk

Proteins identified in cDNA library and in honeybee silk
10         13              Xenosin; GB15233-PA     9                 143.89           25                 AC004701
8          11              Xenospira1; GB12184-PA  10                165.13           37                 No matches
6          7               Xenospira4; GB19585-PA  8                 142.16           35                 No matches
6          7               Xenospira2; GB12348-PA  9                 145.91           28                 No matches
5          6               Xenospira3; GB17818-PA  9                 147.02           31                 No matches

Proteins identified in cDNA library only
4          4               GB14261-PA              0
2          2               Contig 2504             0
2          2               GB17108-PA              0
1          1               Contig 68               0
1          1               Contig 110              0
1          1               Contig 487              0
1          1               GB14199-PA              0
1          1               GB10847-PA              0
1          1               Contig 1047             0
1          1               GB17558-PA              0
1          1               Contig 1471             0
1          1               GB16480-PA              0
1          1               Contig 1818             0
1          1               GB16911-PA              0
1          1               Contig 2046             0
1          1               Contig 2136             0
1          1               Contig 2196             0
1          1               GB11234-PA              0
1          1               GB11199-PA              0
1          1               GB18183-PA              0
1          1               Contig 2938             0
1          1               Contig 2976             0
1          1               Contig 3263             0
1          1               Contig 3527             0
1          1               GB16412-PA              0
1          1               GB18750-PA              0
1          1               GB16132-PA              0
1          1               Contig 4536             0
1          1               GB19431-PA              0
1          1               Contig 4704             0
1          1               Contig 4758             0
1          1               Contig 4830             0
1          1               Contig 4968             0
1          1               Contig 5402             0
1          1               Contig 5971             0
1          1               GB11274-PA              0
1          1               GB14693-PA              0
1          1               GB19585-PA              0
1          1               GB15606-PA              0
1          1               GB16801-PA              0
1          1               GB12085-PA              0
1          1               Contig 7704             0
1          1               Contig 8630             0
1          1               Contig 9774             0
1          1               GB16452-PA              0
1          1               GB10420-PA              0
1          1               GB14724-PA              0

For second strand synthesis, the total volume of all reagents was halved from the manufacturer's recommended amounts and after ethanol precipitation, the cDNA was resuspended in 5 μl of DEPC-treated water. The attB1 adapter (1 μl) was ligated in a total volume of 10 μl to the 5 μl cDNA with 2 μl 5× Adapter buffer, 50 nmol DTT and 1 U T4 DNA ligase at 16° C. for 12 hrs. The cDNA libraries comprised approximately 2.4×10⁷ (bumblebee), 5.0×10⁷ (bulldog ant) and 6000 (green ant) colony forming units (cfu) with less than 1% the original vector for the bulldog ant and bumblebee libraries and greater than 80% original vector in the green ant library. The average insert size within the libraries was 1.3 kbp.

Sequence data was obtained from more than 100 random clones from the cDNA libraries from bumblebee and bulldog ant, 82 clones from the honeybee and 60 clones from the lacewing. The technical difficulties of obtaining salivary glands from the minute green ants (approximately 1 mm in length) reduced the efficiency of the library from this species and as such only 40 sequences were examined. A summary of the silk proteins identified is provided in Table 3.

TABLE 3 Identification and properties of the euaculeatan silk proteins.

Species      Protein   Length of protein  % cDNA library  Distinct summed MS/MS  % helical     MARCOIL predicted coiled
             name      (amino acids)      clones          identification score   structure**   coil length*** (amino acids)

Honeybee     AmelF1*   333                6               52                     76            117
Honeybee     AmelF2*   290                7               51                     88            175
Honeybee     AmelF3*   335                11              107                    81            154
Honeybee     AmelF4*   342                7               88                     76            174
Honeybee     AmelSA1*  578                13              40                     41            45
Bumblebee    BBF1      327                4               180                    86            147
Bumblebee    BBF2      313                14              100                    84            199
Bumblebee    BBF3      332                20              218                    86            146
Bumblebee    BBF4      357                32              137                    80            188
Bumblebee    BBSA1     >501               3               138                    21            0
Bulldog ant  BAF1      422                16              99                     69            121
Bulldog ant  BAF2      411                30              90                     76            132
Bulldog ant  BAF3      394                26              88                     79            131
Bulldog ant  BAF4      441                24              116                    76            157
Weaver ant   GAF1      391                35              228                    74            177
Weaver ant   GAF2      400                22              191                    79            158
Weaver ant   GAF3      395                13              156                    72            103
Weaver ant   GAF4      443                17              148                    74            166
Lacewing     MalF1     596                23              45                     89            151

*also referred to herein as Xenospira1-4 and Xenosin respectively, **predicted by PROFsec, ***predicted by MARCOIL at 90% threshold

Example 2 Preparation and Proteomic Analysis of Native Silk

Honeybee brood comb after the removal of larvae, bumblebee cocoons after the removal of larvae, bulldog ant cocoons after the removal of larvae, or weaver ant silk sheets were washed extensively three times in warm water to remove water soluble contaminants and then washed extensively three times in chloroform to remove wax. Chloroform was removed by rinsing in distilled water and a subset of this silk was retained for analysis. A subset of the Hymenopteran (ants and bees) silk samples was further washed by boiling for 30 minutes in 0.05% sodium carbonate solution, a standard procedure for degumming silkworm silk, then rinsed in distilled water. Lacewing silk was rinsed in distilled water only. A subset of the lacewing silk samples was degummed by boiling for 30 minutes in 0.05% sodium carbonate solution.

A subset of the honeybee material was soaked overnight in 2% SDS at 95° C., followed by three washes in distilled water. Extraction in hot SDS solution solubilises most proteins, but in this case the silk sheets retained their conformation.

The clean silks were analysed by liquid chromatography followed by tandem mass spectrometry (LCMS) as described below.

Pieces of cleaned silk were placed in a well of a Millipore ‘zipplate’, a 96 well microtitre tray containing a plug of C18 reversed phase chromatography medium in the bottom of each well, and 20 μl of 25 mM ammonium bicarbonate containing 160 ng of sequencing grade trypsin (Promega) was added to each well. The tray was then incubated overnight in a humidified plastic bag at 30° C.

The C18 material was wetted by pipetting acetonitrile (10 μl) to the sides of each well and incubating the plate at 37° C. for 15 min. Formic acid solution (130 μl, 1% v/v) was added to each well and after 30 min peptides from the digested bee proteins were captured on the C18 material by slowly drawing the solutions from each well through the base of the plate under a reduced vacuum. The C18 material was washed twice by drawing through 100 μl of formic acid solution. Peptides were eluted with 6 μl of 1% formic acid in 70% methanol pipetted directly onto the C18 material and promptly centrifuged through the C18 plug to an underlying microtitre tray. This tray was placed under vacuum until the volume in each well was reduced about 2-fold by evaporation. Formic acid solution (10 μl) was added to each well and the tray was transferred to the well plate sampler of an Agilent 1100 capillary liquid chromatography system.

Peptides (8 μl) from the silk extract were bound to an Agilent Zorbax SB-C18 5 μm 150×0.5 mm column in 0.1% formic acid/5% acetonitrile at a flow rate of 20 μl·min⁻¹ for one min, then eluted with gradients of increasing acetonitrile concentration to 0.1% formic acid/20% acetonitrile over one minute at 5 μl·min⁻¹, then to 0.1% formic acid/50% acetonitrile over 28 minutes, then to 0.1% formic acid/95% acetonitrile over one minute. The column was washed with 0.1% formic acid/95%-100% acetonitrile over 5 mins at 20 μl·min⁻¹ and reequilibrated with 0.1% formic acid/5% acetonitrile for 7 mins before peptides from the next well were sampled.
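The gradient programme above can be written out as a simple table of segments, which makes the total cycle time per well and the acetonitrile concentration at any time point easy to compute. This is an illustrative sketch only: segment names are invented here, each ramp is assumed to be linear, and the step back from 100% to 5% acetonitrile for re-equilibration is treated as instantaneous.

```python
# (segment name, duration in min, start % acetonitrile, end % acetonitrile)
GRADIENT = [
    ("load", 1.0, 5.0, 5.0),
    ("ramp to 20%", 1.0, 5.0, 20.0),
    ("ramp to 50%", 28.0, 20.0, 50.0),
    ("ramp to 95%", 1.0, 50.0, 95.0),
    ("wash", 5.0, 95.0, 100.0),
    ("re-equilibrate", 7.0, 5.0, 5.0),
]


def total_minutes():
    """Total time per injection, including wash and re-equilibration."""
    return sum(duration for _name, duration, _start, _end in GRADIENT)


def acetonitrile_percent(time_min):
    """Acetonitrile concentration at a given time, by linear interpolation."""
    elapsed = 0.0
    for _name, duration, start, end in GRADIENT:
        if time_min <= elapsed + duration:
            fraction = (time_min - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    raise ValueError("time is beyond the end of the programme")


if __name__ == "__main__":
    print("cycle time (min):", total_minutes())
    print("% acetonitrile at 16 min:", acetonitrile_percent(16.0))
```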

Eluate from the column was introduced to an Agilent XCT ion trap mass spectrometer through the instrument's electrospray ion source fitted with a micronebuliser. Briefly, as peptides were eluting from the column, the ion trap collected full spectrum positive ion scans (100-2200 m/z) followed by two MS/MS scans of ions observed in the full spectrum, avoiding the selection of ions that carried only a single charge. When an ion was selected for MS/MS analysis all others were excluded from the ion trap, and the selected ion was fragmented according to the instrument's recommended “SmartFrag” and “Peptide Scan” settings. Once two fragmentation spectra were collected for any particular m/z value it was excluded from selection for analysis for a further 30 seconds to avoid collecting redundant data.
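The precursor selection logic described above (two MS/MS scans per full scan, singly charged ions skipped, and a 30 second exclusion after two spectra of the same m/z) can be sketched as a small simulation. This is not the instrument's control software: the class name, the ranking of candidates by intensity and the 0.5 m/z bin width used to decide that two ions are "the same" are all assumptions made for illustration.

```python
from collections import defaultdict


class PrecursorSelector:
    """Pick precursors for MS/MS with a simple dynamic exclusion list."""

    def __init__(self, per_cycle=2, spectra_before_exclusion=2,
                 exclusion_s=30.0, mz_tolerance=0.5):
        self.per_cycle = per_cycle
        self.spectra_before_exclusion = spectra_before_exclusion
        self.exclusion_s = exclusion_s
        self.mz_tolerance = mz_tolerance  # assumed bin width, not from the text
        self.spectra_count = defaultdict(int)
        self.excluded_until = {}

    def _bin(self, mz):
        return round(mz / self.mz_tolerance)

    def select(self, time_s, ions):
        """ions: list of (mz, charge, intensity) seen in one full scan."""
        candidates = [
            ion for ion in ions
            if ion[1] >= 2  # skip singly charged ions
            and self.excluded_until.get(self._bin(ion[0]), 0.0) <= time_s
        ]
        # Assumption: the most intense eligible ions are fragmented first.
        candidates.sort(key=lambda ion: ion[2], reverse=True)
        chosen = candidates[:self.per_cycle]
        for mz, _charge, _intensity in chosen:
            key = self._bin(mz)
            self.spectra_count[key] += 1
            if self.spectra_count[key] >= self.spectra_before_exclusion:
                self.excluded_until[key] = time_s + self.exclusion_s
                self.spectra_count[key] = 0
        return [ion[0] for ion in chosen]


if __name__ == "__main__":
    selector = PrecursorSelector()
    scan = [(500.3, 2, 1000.0), (600.4, 1, 5000.0),
            (700.5, 3, 800.0), (800.6, 2, 100.0)]
    for t in (0.0, 1.0, 2.0, 31.5):
        print(t, selector.select(t, scan))
```

In the simulated run, the two most intense multiply charged ions are fragmented in the first two cycles, after which a weaker ion gets its turn until the exclusion period lapses.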

Mass spectral data sets from the entire experiment were analysed using Agilent's Spectrum Mill software to match the data with predictions of protein sequences from the cDNA libraries. The software generated scores for the quality of each match between experimentally observed sets of masses of fragments of peptides and the predictions of fragments that might be generated according to the sequences of proteins in a provided database. All the sequence matches reported here received scores greater than 20, the default setting for automatic, confident acceptance of valid matches.

This analysis identified that five proteins expressed at high levels in the labial gland matched the silk from each of the cognate bee species (shown in Tables 2 and 3) and four proteins expressed at high levels in the labial gland matched the silk from each of the cognate ant species (shown in Table 3). The abundance of messenger RNA encoding these proteins in the labial gland of the larvae was consistent with the proteins being abundantly produced (abundance of message shown in Table 3).

To ensure that none of the honeybee silk proteins were missed by this identification process, we also compared the honeybee silk trypsin peptide mass spectral data to a set of publicly available predicted protein sequences from the honeybee genome project, generated by a computer algorithm that tries to recognise transcribed genes in the complete genomic DNA sequences of the bee. Additionally, we generated a database of translations in the six possible reading frames of each contiguous genomic DNA sequence provided by the bee genome project (Amel3 release). These translated DNA sequences were presented to the Spectrum Mill software as if they were the sequences of very large proteins. Matching MS/MS peptide data identified open reading frames within the genomic sequences that had encoded parts of the isolated bee proteins without the need to first predict the organisation of genes. No additional proteins were identified in the silk by this analysis.
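A six-frame translation of a genomic contig, as used to build the search database described above, can be sketched as follows. This is a minimal illustration using the standard genetic code; the function names and the short example sequence are hypothetical, stop codons are written as "*", and codons containing ambiguous bases are translated as "X".

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, in TCAG order for each codon position.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate from the first base; incomplete trailing codons are dropped."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to translations."""
    dna = dna.upper()
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translations("ATGGCCAAGTAA").items():
        print(frame, protein)
```

Each translated frame can then be searched as if it were one very long protein, so that peptide matches locate coding regions without prior gene prediction.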

Example 3 Structural Analysis of the Native Silk

Native silk samples were prepared as described in Example 2. Silk samples were examined using a Bruker Tensor 37 Fourier transform infrared spectrometer with a Pike Miracle diamond attenuated total reflection accessory. Analysis of the amide I and II regions of the spectra of honeybee, bumblebee, green ant, bulldog ant silks and lacewing larval silk (FIG. 1) shows that all these silks have a predominantly alpha-helical secondary structure. The silks of the Euaculeatan species have dominant peaks in the FT-IR spectra at 1645-1646 cm⁻¹, shifted approximately 10 cm⁻¹ lower than a classical α-helical signal and broadened. This shift in the α-helical signal is typical of coiled-coil proteins (Heimburg et al., 1999). Spectra from samples that were degummed were unchanged.

Example 4 The Amino Acid Composition of Native Silks Closely Resembles that of the Identified Silk Proteins

The amino acid composition of the native silks was determined after 24 hr gas phase hydrolysis at 110° C. using the Waters AccQTag chemistry by Australian Proteome Analysis Facility Ltd (Macquarie University, Sydney).

The measured amino acid composition of the SDS washed silk was similar to that predicted from the identified silk protein sequences (FIGS. 2 and 3).

Example 5 Structural Analysis of the Silk Proteins

Predicted Secretory Peptides

As expected for silk proteins, the SignalP 3.0 signal prediction program (Bendtsen et al., 2004), which uses two models to identify signal peptides, predicted that all the identified silk genes encoded proteins which contain signal peptides that target them for secretion from a cell (data not shown). The predicted cleavage sites of the polypeptides are as follows:

Xenospira1 (AmelF1)—between pos 19 and 20 (ASA-GL),

Xenospira2 (AmelF2)—between pos 19 and 20 (AEG-RV),

Xenospira3 (AmelF3)—between pos 19 and 20 (VHA-GV),

Xenospira4 (AmelF4)—between pos 19 and 20 (ASG-AR),

Xenosin (AmelSA1)—between pos 19 and 20 (VCA-GV),

BBF1—between pos 19 and 20 (ASA-GQ),

BBF2—between pos 20 and 21 (AEG-HV),

BBF3—between pos 19 and 20 (VHA-GS),

BBF4—between pos 19 and 20 (ASA-GK),

BAF1—between pos 19 and 20 (ASA-SG),

BAF2—between pos 19 and 20 (ASG-RV),

BAF3—between pos 19 and 20 (ASG-NL),

BAF4—between pos 19 and 20 (VGA-SE),

GAF1—between pos 19 and 20 (ADA-SK),

GAF2—between pos 19 and 20 (ASG-GV),

GAF3—between pos 19 and 20 (ASG-GV),

GAF4—between pos 19 and 20 (VGA-SE),

MalF1—between pos 26 and 27 (SST-AV).

All Four of the Ant and Four of the Five Bee Silk Proteins are Helical and Form Coiled Coils

Protein modelling and results from pattern recognition algorithms confirmed that the majority of the identified honeybee silk proteins were helical proteins that formed coiled coils.

The PROFsec (Rost and Sander, 1993) and NNPredict (McClelland and Rumelhart, 1988; Kneller et al., 1990) algorithms were used to investigate the secondary structure of the proteins encoded by the identified silk genes. These algorithms identified Xenospira1 [GB12184-PA] (SEQ ID NO:1), Xenospira2 [GB12348-PA] (SEQ ID NO:3), Xenospira3 [GB17818-PA] (SEQ ID NO:5), and Xenospira4 [GB19585-PA] (SEQ ID NO:7) as highly helical proteins, with between 76-85% helical structure (Table 4). Xenosin [GB15233-PA] (SEQ ID NO:10) had significantly less helical structure.

TABLE 4 The secondary structure of Apis mellifera silk proteins predicted by PROFsec (Rost and Sander, 1993) showing percentages of helices, extended sheets and loops.

              Helical               Extended              Loop
Protein       PROFsec  NNPredict    PROFsec  NNPredict    PROFsec  NNPredict
Xenospira3    77       70           3        6            20       27
Xenospira4    85       82           2        6            14       16
Xenospira1    80       73           1        4            19       26
Xenospira2    77       69           2        5            21       29
Xenosin       41       41           8        9            51       50

Further protein modelling and results from pattern recognition algorithms confirmed that the majority of the identified silk proteins were helical proteins that formed coiled coils. PredictProtein (Rost et al., 2004) algorithms were used to investigate the secondary structure of the identified silk genes. These algorithms identified Xenospira1 (SEQ ID NO:1), Xenospira2 (SEQ ID NO:3), Xenospira3 (SEQ ID NO:5), Xenospira4 (SEQ ID NO:7), BBF1 (SEQ ID NO:22), BBF2 (SEQ ID NO:24), BBF3 (SEQ ID NO:26), BBF4 (SEQ ID NO:28), BAF1 (SEQ ID NO:40), BAF2 (SEQ ID NO:42), BAF3 (SEQ ID NO:44), BAF4 (SEQ ID NO:46), GAF1 (SEQ ID NO:56), GAF2 (SEQ ID NO:58), GAF3 (SEQ ID NO:60), GAF4 (SEQ ID NO:62), and MalF1 (SEQ ID NO:72) as highly helical proteins, with between 69-88% helical structure (Table 3). AmelSA1 [GB15233-PA] (Xenosin) (SEQ ID NO:10) and BBSA1 (SEQ ID NO:30) had significantly less helical structure.

Super-coiling of helical proteins (coiled coils) arises from a characteristic heptad repeat sequence normally denoted as (abcdefg)ₙ with generally hydrophobic residues in positions a and d, and generally charged or polar residues at the remaining positions. The pattern recognition programs (MARCOIL (Delorenzi and Speed, 2002), COILS (Lupas et al., 1991)) identified numerous heptad repeats typical of coiled coils in Xenospira1 [GB12184-PA] (SEQ ID NO:1), Xenospira2 [GB12348-PA] (SEQ ID NO:3), Xenospira3 [GB17818-PA] (SEQ ID NO:5), and Xenospira4 [GB19585-PA] (SEQ ID NO:7) (MARCOIL: Table 5; COILS: FIG. 4), as well as BBF1 (SEQ ID NO:22), BBF2 (SEQ ID NO:24), BBF3 (SEQ ID NO:26), BBF4 (SEQ ID NO:28), BAF1 (SEQ ID NO:40), BAF2 (SEQ ID NO:42), BAF3 (SEQ ID NO:44), BAF4 (SEQ ID NO:46), GAF1 (SEQ ID NO:56), GAF2 (SEQ ID NO:58), GAF3 (SEQ ID NO:60), GAF4 (SEQ ID NO:62), and MalF1 (SEQ ID NO:72) (MARCOIL: Table 3).
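The heptad notation can be made concrete with a short sketch. It assumes a single fixed register over the whole sequence, which is a simplification: programs such as MARCOIL and COILS assign registers probabilistically and allow them to shift. The function names and the 14-residue test sequence are invented for illustration and are not taken from any SEQ ID NO.

```python
# Minimal sketch: label residues with heptad positions a-g under one fixed
# register and measure how much alanine sits at chosen positions.
# Assumption: the register never shifts (real predictors do not assume this).
from collections import Counter

HEPTAD = "abcdefg"


def heptad_positions(length, first_position="a"):
    """Return the heptad label for each residue index 0..length-1."""
    start = HEPTAD.index(first_position)
    return [HEPTAD[(start + i) % 7] for i in range(length)]


def residues_by_position(sequence, first_position="a"):
    """Count residues observed at each heptad position."""
    counts = {p: Counter() for p in HEPTAD}
    labels = heptad_positions(len(sequence), first_position)
    for residue, label in zip(sequence, labels):
        counts[label][residue] += 1
    return counts


def alanine_fraction(sequence, positions=("a", "d"), first_position="a"):
    """Fraction of residues at the given heptad positions that are alanine."""
    counts = residues_by_position(sequence, first_position)
    total = sum(sum(counts[p].values()) for p in positions)
    alanines = sum(counts[p]["A"] for p in positions)
    return alanines / total if total else 0.0


if __name__ == "__main__":
    toy = "AKEALSQLEKASTN"  # two invented heptads, not a real silk sequence
    print(heptad_positions(len(toy)))
    print(alanine_fraction(toy))
```

Applied to a predicted coiled-coil region, the same tally gives a per-position residue composition for that region.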

Identification of a Novel Coiled Coil Sequence in the Honeybee Silk Proteins

The heptad repeats of amino acid residues identified in the sequences of Xenospira1 [GB12184-PA], Xenospira2 [GB12348-PA], Xenospira3 [GB17818-PA] and Xenospira4 [GB19585-PA] were each highly indicative of a coiled coil secondary structure (FIG. 5) (see Table 5 for confidence levels). The fact that the heptads are found consecutively and numerously suggests the proteins adopt a very regular structure. Overlapping heptads were identified in two of the honeybee proteins: the major coiled coil region of Xenospira1 contained overlapping heptads with a 3 residue offset followed by a space of 5 residues and then four consecutive heptads; and the entire coiled coil region of Xenospira2 had multiple overlapping heptads with a single offset and 4 residue offset (equivalent to 3 residue offset). The composition of amino acids in the various positions of the major heptad is shown in the first column in Table 6, with the positions of the overlapping heptads indicated in adjacent columns.

TABLE 5 Percent of residues in the identified silk proteins predicted to exist as coiled coil by the MARCOIL (Delorenzi and Speed, 2002) pattern recognition algorithm.

             Length of mature   Percent protein that exists as coiled coil
             protein
Protein      (amino acids)      50% threshold            90% threshold                         99% threshold
Xenospira3   315                64% (residues 68-268)    34% (residues 128-223 and 235-246)    20% (residues 149-211)
Xenospira4   290                73% (residues 83-293)    60% (residues 98-168 and 182-285)     27% (residues 113-154 and 212-247)
Xenospira1   316                69% (residues 67-282)    49% (residues 103-256)                18% (residues 113-169)
Xenospira2   328                65% (residues 89-298)    54% (residues 110-283)                45% (residues 127-270)
Xenosin      350                26% (residues 32-127)     9% (residues 42-75)                   2% (residues 59-67)
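As a worked illustration of how the percentages in Table 5 relate to the residue ranges, the following sketch counts each range inclusively and divides by the mature protein length. Inclusive counting is an assumption; it reproduces the three Xenospira3 values, but other rows may differ by about a point depending on the range or rounding convention used in the original analysis. The function name is hypothetical.

```python
# Sketch: percent of a protein covered by predicted coiled-coil segments.
# Assumption: residue ranges are inclusive at both ends.

def percent_coiled_coil(mature_length, segments):
    """segments is a list of (first_residue, last_residue) tuples."""
    covered = sum(last - first + 1 for first, last in segments)
    return 100.0 * covered / mature_length


if __name__ == "__main__":
    # Xenospira3 values from Table 5 (mature length 315 amino acids).
    print(round(percent_coiled_coil(315, [(68, 268)])))                # 50% threshold
    print(round(percent_coiled_coil(315, [(128, 223), (235, 246)])))   # 90% threshold
    print(round(percent_coiled_coil(315, [(149, 211)])))               # 99% threshold
```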

Surprisingly, the major heptads have a novel composition when viewed collectively, with an unusually high abundance of alanine in the ‘hydrophobic’ heptad positions a and d (see Table 6 and FIG. 5). Additionally, a high proportion of heptads have alanine at both a and d positions within the same heptad (33% in Xenospira1 [GB12184-PA]; 36% in Xenospira2 [GB12348-PA]; 27% in Xenospira3 [GB17818-PA]; and 38% in Xenospira4 [GB19585-PA]; see Tables 6 and 7).

TABLE 6 Summary of the number of each amino acid residue in the various heptad positions in coiled coil regions of honeybee silk proteins.

Xenospira4
Position   A   I   R   L   K   T   E   V   F   S   Q   N   D   G   M   Y   W   Total
a         23   0   1   1   0   1   1   1   0   1   0   0   0   0   0   0   0   29
b         12   0   0   2   2   2   3   1   0   3   1   1   1   1   0   0   0   29
c         12   0   0   1   5   1   3   1   0   3   1   1   0   1   0   0   0   29
d         17   0   0   5   1   0   1   2   0   2   1   0   0   0   0   0   0   29
e         12   0   1   0   0   2   4   2   0   5   2   1   0   0   0   0   0   29
f         13   1   0   1   2   0   7   1   0   1   1   2   0   0   0   0   0   29
g          9   3   4   0   2   1   2   1   0   2   0   1   2   2   0   0   0   29

Xenospira3
Position   A   I   R   L   K   T   E   V   F   S   Q   N   D   G   M   Y   W   Total
a         19   0   0   1   0   4   2   0   0   1   1   1   0   0   0   1   0   30
b          8   0   0   5   1   2   2   0   0   5   4   2   1   0   0   0   0   30
c         13   0   1   0   3   2   2   3   0   1   2   0   1   1   0   0   1   30
d         13   3   0   2   2   0   2   2   0   4   0   1   1   0   0   0   0   30
e          8   0   0   2   2   2   4   0   0   7   4   0   0   1   0   0   0   30
f          7   0   2   3   4   2   4   0   0   4   1   2   1   0   0   0   0   30
g          9   0   5   2   3   0   1   2   0   5   0   2   1   0   0   0   0   30

Xenospira2
Position   A   I   R   L   K   T   E   V   F   S   Q   N   D   G   M   Y   W   Total
a         20   0   0   1   0   3   1   1   1   1   0   0   0   0   0   0   0   28
b          7   2   2   2   2   2   2   4   0   1   1   3   0   0   0   0   0   28
c          9   0   2   0   4   1   2   4   0   1   3   2   0   0   1   1   1   28
d         16   0   0   3   3   1   0   1   0   1   2   0   1   0   0   0   0   28
e         11   0   1   3   0   3   4   1   0   2   2   0   1   0   0   0   0   28
f         10   2   1   0   1   2   6   1   0   3   1   1   0   0   0   0   0   28
g          3   4   1   0   1   1   5   0   0   0   2   4   0   1   1   0   0   28

Xenospira1
Position   A   I   R   L   K   T   E   V   F   S   Q   N   D   G   M   Y   W   Total
a         13   3   0   1   2   0   1   1   0   2   1   1   0   2   0   0   0   27
b          7   1   1   1   6   0   2   1   0   3   1   0   4   0   0   0   0   27
c          8   1   2   1   1   1   7   2   0   1   1   1   0   1   0   0   0   27
d         18   0   0   2   1   2   1   0   0   2   1   0   0   0   0   0   0   27
e         11   1   2   1   1   2   3   2   0   4   0   0   0   0   0   0   0   27
f          7   0   3   0   2   1   3   3   0   7   0   1   0   0   0   0   0   27
g         13   0   0   3   3   0   2   1   0   3   1   0   0   0   1   0   0   27

TABLE 7 Summary of alanine residues in heptads of honeybee silk proteins.

             Amount of          Number of   Amount of Ala   Amount of Ala   Amount of Ala   Amount of Ala in
             protein helical    major       in major        in position a   in position d   a and d position
             structure (%)¹     heptads     heptads (%)     of major        of major        of major
Protein                                                     heptads (%)     heptads (%)     heptads (%)
Xenospira1   77 (70)            27          41              44              74              33
Xenospira2   85 (82)            28          37              71              57              36
Xenospira3   80 (73)            30          37              63              43              27
Xenospira4   77 (69)            29          48              79              58              38
Xenosin      41 (41)            n/a         n/a             n/a             n/a

¹PROFsec predictions with NNPredict predictions shown in brackets.
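The position-specific alanine percentages in Table 7 can be derived from the residue counts in Table 6. The sketch below does this for the a and d rows listed for Xenospira3 in Table 6; the variable and function names are hypothetical, and simple rounding to the nearest integer is assumed.

```python
# Sketch: percent alanine at a heptad position from Table 6 style counts.
# Column order follows the Table 6 header; the Total column is omitted.

RESIDUES = "A I R L K T E V F S Q N D G M Y W".split()

# Counts listed for Xenospira3 in Table 6 (positions a and d only).
XENOSPIRA3_COUNTS = {
    "a": [19, 0, 0, 1, 0, 4, 2, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0],
    "d": [13, 3, 0, 2, 2, 0, 2, 2, 0, 4, 0, 1, 1, 0, 0, 0, 0],
}


def percent_of_residue(counts, residue="A"):
    """Percent of heptads carrying the given residue at one position."""
    total = sum(counts)
    return 100.0 * counts[RESIDUES.index(residue)] / total


if __name__ == "__main__":
    for position, counts in XENOSPIRA3_COUNTS.items():
        print(position, sum(counts), round(percent_of_residue(counts)))
```

For these two rows the result agrees with the 63% and 43% entries given for Xenospira3 in Table 7.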

The composition of amino acids in the various heptad positions in the coiled coil region of the hymenopteran silks is summarised in FIGS. 6 and 7. As noted above, the positions within the heptads have a novel composition: the ‘hydrophobic’ heptad positions a and d of the bee and ant silks contain very high levels of alanine (average 58%) and high levels of small polar residues (average 21%) in comparison to other coiled coils. Additionally, position e is unusually small and hydrophobic (Table 8, FIG. 7). Topographically this position is located adjacent to the a residues within the helices. Its compositional similarity with the a and d residues suggests that the silks adopt a coiled coil structure with three core residues per α-helix. Three residue cores contribute a larger hydrophobic interface than two residues in the core (Deng et al., 2006), a feature that would assist coiled coil formation and stability.

In addition, when viewed collectively the positions b, c, e, f and g within the heptad are generally more hydrophobic, less polar and less charged than protein coiled coil regions previously characterised (see FIG. 7, and Tables 8 and 9). Therefore, although historically it was regarded that the helical content of the aculeate Hymenopteran silk was a consequence of a reduced glycine content and increased content of acidic residues (Rudall and Kenchington, 1971), we have discovered that it is not the glycine/acid residues that are responsible for the novel silk structure but rather the position of the alanine residues within the polypeptide chains.

TABLE 8 Average size and hydrophobicity at each heptad position of the orthologous hymenopteran silk proteins and of the green lacewing silk protein (MalF1) showing that a, d, and e positions (core) are smaller and more hydrophobic than other positions. In some cases the b position (partially submerged) is also small and hydrophobic.

Heptad position                                a       b       c       d       e       f       g
Amel F1 orthologs
  Average residue side chain hydrophobicity    0.36    0.20    0.20    0.30    0.26    −0.16   0.03
  Average residue side chain length            1.7     2.5     2.5     2.1     2.3     3.0     2.6
Amel F2 orthologs
  Average residue side chain hydrophobicity    0.53    0.20    0.03    0.36    0.24    0.05    0.12
  Average residue side chain length            1.5     2.6     2.6     2.0     2.2     2.5     3.0
Amel F3 orthologs
  Average residue side chain hydrophobicity    0.44    0.36    0.06    0.41    0.27    −0.10   0.00
  Average residue side chain length            1.9     2.3     2.4     2.1     2.3     2.8     2.8
Amel F4 orthologs
  Average residue side chain hydrophobicity    0.46    0.17    −0.13   0.61    0.04    0.06    0.06
  Average residue side chain length            1.4     2.2     2.6     2.04    2.3     2.6     2.7
MalF1
  Average residue side chain hydrophobicity    −0.05   0.14    −0.61   0.27    0.59    0.23    −0.22
  Average residue side chain length            2.1     1.7     2.5     1.4     1.5     1.7     3.5

Example 6 The Bee Silk Proteins are Likely to be Extensively Cross-Linked

The bee silk proteins all contain a high proportion of lysine (6.5%-16.3%). A comparison between the measured amino acid composition of bee silk and the sequences of the identified silk proteins reveals a substantial mismatch in the number of lysine residues, with much less lysine detected in the silk than expected (FIGS. 2 and 3). This suggests that lysine residues in the silk have been modified, and so are not identified by standard amino acid analysis. Lysine is known to form a variety of cross-links: either enzymatic cross-links catalysed by lysyl oxidase or nonenzymatic cross-links generated from glycated lysine residues (Reiser et al., 1992). The under-representation of lysine in the honeybee and bumblebee silk amino acid analysis is consistent with the presence of lysine cross-linking.

TABLE 9 Number of residues in each class of amino acids at various heptad positions in coiled coil regions of silk proteins.

Protein      Heptad position   Nonpolar   Polar   Charged   Small   Medium   Large
Xenospira4   a                 25         2       2         26      2        1
             b                 16         7       6         19      10       0
             c                 15         6       8         18      11       0
             d                 24         3       2         21      8        0
             e                 14         10      5         21      7        1
             f                 16         4       9         15      14       0
             g                 15         4       10        15      10       4
Xenospira3   a                 20         8       2         24      5        1
             b                 13         13      4         15      15       0
             c                 17         6       7         20      8        2
             d                 20         5       5         19      11       0
             e                 11         13      6         18      12       0
             f                 10         9       11        13      15       2
             g                 13         7       10        16      9        5
Xenospira2   a                 23         4       1         25      2        1
             b                 15         7       6         14      12       2
             c                 13         7       8         15      11       2
             d                 20         4       4         19      9        0
             e                 15         7       6         17      10       1
             f                 13         7       8         16      11       1
             g                 14         7       7         10      17       1
Xenospira1   a                 20         4       3         18      9        0
             b                 10         4       13        11      15       1
             c                 13         4       10        13      12       2
             d                 20         5       2         22      5        0
             e                 15         6       6         19      6        2
             f                 10         9       8         18      6        3
             g                 18         4       5         17      10       0

Covalently cross-linked proteins subjected to SDS polyacrylamide gel electrophoresis (PAGE) are expected to migrate according to the molecular weight of the cross-linked complex. We subjected late last instar honeybee labial gland proteins to SDS PAGE and measured the migration of the silk proteins in relation to standard protein markers. Bands were observed corresponding to monomers of each of the identified silk proteins; however, higher molecular weight bands containing these proteins were also present, as expected in a cross-linked system (FIG. 8).

As described above, the honeybee labial gland contains a mixture of organised and disorganised silk proteins. The cross-linked proteins observed probably correspond to the protein population of the anterior region of the gland, where the silk is prepared for extrusion. It is reasonable to assume that extracellular honeybee silk contains a substantially higher proportion of cross-linked proteins than is observed in a heterogeneous mixture of all stages of salivary gland silk proteins. The bonds are unlikely to be cysteine cross-links, as the silk was unaffected by reductive treatment, and the identified silk proteins contain few or no cysteine residues.

Example 7 The Euaculeatan Silk Proteins Differ Significantly from the Other Silk Proteins

The euaculeatan silk is significantly different from other described silks in relation to amino acid composition (Table 10), molecular weight of the proteins involved, secondary structure and physical properties (Tables 11 and 12). The lepidopteran silks are primarily composed of the small amino acid residues alanine, serine and glycine (for example the silk of Bombyx mori, Table 10) and are dominated by extended beta sheet secondary structure. The Cotesia glomerata silk protein is high in asparagine and serine, the abundance of the latter residue being characteristic of Lepidopteran silk sericins (glues) (Table 10). Modelling of the Cotesia glomerata silk protein does not identify helices or coiled coils in the secondary structure. In contrast, the bee, ant and lacewing silks are high in alanine (Table 10) and have a high level of helical secondary structure that forms coiled coils.

TABLE 10 Amino acid composition of silk from various insects with most abundant residues shown in boldface.

                             Honeybee   Euaculeatan   Mallada        Cotesia             Bombyx
                                        silk          silk           glomerata           mori
Alanine                      22.6       27.5          26.9           12.5                29.3
Glutamic acid + Glutamine    16.1       13.9          7.4            0.6                 0.9
Aspartic acid + Asparagine   13.2       8.6           15.0           37.6 (Asn 33.7)     1.2
Serine                       10.4       11.5          8.5            37.1                11.3
Leucine                      9.0        7.2           5.9            0.4                 0.4
Valine                       6.6        4.8           4.1            0.3                 2.1
Glycine                      5.7        6.6           11.2           5.5                 46.0
Isoleucine                   5.6        4.0           3.9            0.4                 0.6
Threonine                    5.1        4.9           5.3            0.5                 0.8
Lysine                       3.7        3.7           3.2            0.1                 0.3
Phenylalanine                2.0        1.0           0.5            0.5                 0.6
Tyrosine                     0          0.9           0.5            3.1                 5.3
Proline                      0          0             0              0.7                 0.4
Histidine                    0          0.5           0.5            0.4                 0.2
Arginine                     0          3.3           5.4            0.2                 0.4
Methionine                   0          1.0           1.6            0                   0.1
Tryptophan                   0          Not reported  Not reported   Not reported        0.2
Cysteine                     0          0.4           0.3            Not reported        0.1

TABLE 11 Differences between insect silks.

                             Ant and bee silk   Mallada silk   Cotesia sp.                        Lepidoptera, for example Bombyx mori
Most abundant amino acids    Ala                Ala            Ser, Asn                           Gly, Ala
Size of fibroin proteins     25-35 kDa          57 kDa         Approx 500 kDa                     >100 kDa
Secondary structure          Coiled coil        Coiled coil    Most likely beta sheets.           beta-pleated sheets loosely
                                                               Secondary structure prediction     associated with beta-sheets,
                                                               programs PROFsec and MARCOIL do    beta-spirals, alpha helices and
                                                               not recognise any helical          amorphous regions
                                                               structure or coiled coil regions.

TABLE 12 Solubility of insect silks.

                           Ant and bee silk    Mallada silk        Cotesia sp.         Bombyx mori
Solvent                    20° C.    95° C.    20° C.    95° C.    20° C.    95° C.    20° C.    95° C.
LiBr 54%                   —         —         —         —         —         part      —         ✓
LiSCN saturated            —         —         —         —         —         part      —         ✓
8M urea                    —         —         —         —         —         —         —         part
6M guanidine HCl           —         —         —         —         —         —         —         part
1M NaOH                    —         part      ?         ?         —         part      part      ✓
6M HCl                     —         part      part      ✓         —         part      —         ✓
3M HCl/50% propanoic acid  —         part      ?         ?         —         part      part      ✓

Cladistic analysis of the coiled coil regions of the silk proteins of the four Hymenopteran species (FIG. 9) suggests that the genes evolved in a common ancestor that predates the divergence of the Euaculeata from the parasitic wasps. The sequences of the silk have diverged extensively and we were only able to align the 210 amino acids that comprise the coiled coil region of each protein. The amino acid sequence identity between the coiled coil regions of each of the silk proteins provided herein is shown in Table 13 and DNA identity in the corresponding region is shown in Table 14. Whilst the proteins have similar amino acid contents (especially high levels of alanine) and tertiary structure, the primary amino acid sequence identity is very low. In fact, the gene encoding the Mallada silk protein has evolved independently and as such the silk protein sequence cannot be aligned to the Hymenopteran sequences. This indicates that considerable variety in the identity of the amino acids can occur, whilst not affecting the biological function of the proteins.

The cladistic analysis predicts that the silk of euaculeatan wasps is composed of proteins related to those of the silk of ants and bees and that, although these proteins will have similar composition and architecture to the proteins described here, they will have highly diverged primary sequence.

The amino acid sequences of the silk proteins provided herein (FIG. 10) were subjected to comparisons with protein databases; however, no prior art proteins were identified with any reasonable level of sequence identity (for example, none greater than 30% identical over the length of the silk protein sequence).

TABLE 13 Percent identity between protein sequences of the coiled coil region of the fibre proteins in ants and bees.

        Honeybee                Bumblebee               Bulldog ant             Green ant
        F1    F2    F3    F4    F1    F2    F3    F4    F1    F2    F3    F4    F1    F2    F3    F4
beeF1   100
beeF2   26.7  100
beeF3   23.3  31.4  100
beeF4   34.8  32.4  30.0  100
BBF1    65.7  28.1  24.8  35.7  100
BBF2    28.6  71.4  28.6  31.9  31.0  100
BBF3    25.2  31.0  65.7  27.6  27.1  29.5  100
BBF4    33.3  31.0  29.5  64.8  34.8  31.4  28.1  100
BAF1    37.1  20.0  20.0  32.4  39.5  21.4  21.4  29.1  100
BAF2    25.2  44.3  29.5  33.8  28.1  38.1  28.6  27.6  27.1  100
BAF3    23.8  26.2  36.7  28.1  24.8  25.2  36.7  28.1  21.0  27.6  100
BAF4    28.1  33.8  24.8  45.2  28.6  33.8  23.3  43.8  26.1  27.6  25.2  100
GAF1    33.8  20.0  23.8  32.9  36.2  22.9  23.8  29.1  66.7  28.1  25.2  28.6  100
GAF2    24.8  41.9  27.6  29.5  28.1  39.5  29.0  26.7  21.9  66.2  23.8  26.7  23.8  100
GAF3    26.9  28.8  40.1  31.6  25.5  28.3  38.2  30.2  24.0  28.3  62.7  27.4  27.4  26.4  100
GAF4    24.7  32.4  24.3  37.6  27.1  32.4  24.8  38.1  23.9  29.5  21.0  63.3  24.8  27.6  24.1  100

TABLE 14 Percent identity between nucleotide sequences encoding coiled coil region of the fibre proteins in ants and bees.

        Honeybee                Bumblebee               Bulldog ant             Green ant
        F1    F2    F3    F4    F1    F2    F3    F4    F1    F2    F3    F4    F1    F2    F3    F4
beeF1   100
beeF2   39.4  100
beeF3   37.0  40.2  100
beeF4   45.1  44.8  41.0  100
BBF1    68.9  40.9  37.5  45.2  100
BBF2    42.5  72.9  42.5  44.9  42.2  100
BBF3    40.6  40.0  67.6  40.5  38.4  41.0  100
BBF4    45.4  41.0  41.7  66.0  45.9  43.6  40.0  100
BAF1    45.7  35.1  35.9  41.1  47.9  36.5  36.0  38.7  100
BAF2    38.1  49.8  41.4  44.6  38.7  47.3  40.0  41.0  40.6  100
BAF3    33.3  36.7  45.4  40.3  36.3  36.8  46.2  39.4  36.0  40.5  100
BAF4    39.5  43.3  41.4  46.8  43.0  47.6  39.8  49.4  42.5  41.7  40.3  100
GAF1    45.6  35.1  37.3  42.4  47.6  38.5  37.8  41.4  68.9  41.7  36.7  43.0  100
GAF2    38.5  47.8  38.4  43.2  38.1  46.5  41.4  40.0  37.5  69.7  38.9  40.6  39.4  100
GAF3    39.0  40.1  46.1  41.8  37.7  39.3  46.1  40.0  37.7  41.7  65.1  41.2  40.0  41.7  100
GAF4    38.9  42.4  38.1  44.9  38.9  43.8  38.4  44.3  37.3  42.7  36.7  67.8  38.2  40.3  37.7  100
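Pairwise identity values of the kind tabulated in Tables 13 and 14 can be computed from an existing alignment along the following lines. This is a sketch under stated assumptions: the sequences are already aligned to equal length, columns containing a gap in either sequence are skipped, and the denominator is the number of compared columns. The specification does not state which convention was used, and the function names and toy sequences are hypothetical.

```python
# Sketch: percent identity between pre-aligned sequences and a lower-triangle
# identity matrix. Gap handling and denominator are assumptions.

def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared (non-gap) alignment columns that are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned):
    """Lower triangle (including diagonal) keyed by (row_name, column_name)."""
    names = list(aligned)
    return {
        (names[i], names[j]): percent_identity(aligned[names[i]], aligned[names[j]])
        for i in range(len(names))
        for j in range(i + 1)
    }


if __name__ == "__main__":
    toy = {"seq1": "AKEAL", "seq2": "AKDAL"}  # invented, not real silk sequences
    for pair, value in identity_matrix(toy).items():
        print(pair, round(value, 1))
```

For scale, over the 210 aligned coiled coil positions mentioned above, a value of 65.7% corresponds to 138 identical positions.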

The open reading frames encoding the silk proteins (provided in FIG. 11) were subjected to similar database searching as that described above. The only related molecules that were identified have been published as part of the honeybee genome project (www.ncbi.nlm.nih.gov/genome/guide/bee). The open reading frames had been predicted by the bee genome project; however, the function of the encoded proteins had not been suggested. Furthermore, there is no evidence that a polynucleotide comprising the open reading frame of the mRNA had ever been produced for any of these molecules.

The genes encoding Xenospira1, Xenospira2, Xenospira3 and Xenospira4 comprise an exon covering the entire single open reading frame, whereas the gene encoding Xenosin comprises at least one intron (see FIG. 12).

Example 8 Expression of Silk Proteins in Transgenic Plants

A plant expression vector encoding a silk protein of the invention may consist of a recombinant nucleic acid molecule coding for said protein (for example a polynucleotide provided in any one of SEQ ID NO's: 11 to 21, 31 to 39, 48 to 55, 64 to 71, 74 or 75) placed downstream of the CaMV 35S promoter in a binary vector backbone containing a kanamycin-resistance gene (NptII).

For the polynucleotides comprising any one of SEQ ID NO's 11, 13, 15, 17, 19, 31, 33, 35, 37, 48, 50, 52, 54, 64, 66, 68, 70 or 74 the construct may further comprise a signal peptide encoding region such as Arabidopsis thaliana vacuolar basic chitinase signal peptide, which is placed in-frame and upstream of the sequence encoding the silk protein.

The construct carrying a silk protein encoding polynucleotide is transformed separately into Agrobacterium tumefaciens by electroporation prior to transformation into Arabidopsis thaliana. The hypocotyl method of transformation can be used to transform A. thaliana, which can be selected for survival on selective media comprising kanamycin. After roots are formed on the regenerants they are transferred to soil to establish primary transgenic plants.

Verification of the transformation process can be achieved via PCR screening. Incorporation and expression of the polynucleotide can be measured using PCR, Southern blot analysis and/or LC/MS of trypsin-digested expressed proteins.

Two or more different silk protein encoding constructs can be provided in the same vector, or numerous different vectors can be transformed into the plant each encoding a different protein.

As an experimental example of plant expression, a codon-optimised version of AmelF4 (Xenospira4) (SEQ ID NO:76) was cloned into pET14b (Novagen), generating pET14b-6×His:F4op, forming an in-frame translational fusion with a 6×histidine tag at the N-terminus of the protein. The sequence encoding the protein “6×Histidine:F4op” was cloned into pVEC8 (Wang et al., 1992) under the control of the CaMV 35S promoter and ocs polyadenylation regulatory apparatus, generating pVEC8-35S-6×His:F4op-ocs. pET14b-6×His:F4op was transformed into chemically-competent E. coli and pVEC8-35S-6×His:F4op was transformed into tobacco leaf discs by Agrobacterium mediated transformation. Proteins from antibiotic resistant E. coli (induced expression) and tobacco leaves were isolated and subjected to western blot analysis using the Tetra-Histidine antibody (Qiagen, Karlsrule, Germany) for detection. The empty vectors pET14b and pVEC8-35S-ocs were used as negative controls in their respective host backgrounds. As shown in FIG. 13, these experiments resulted in the plant producing the Xenospira4 (AmelF4) protein.

Example 9 Fermentation and Purification of Silk Proteins

Expression constructs were prepared after the silk coding regions of honeybee genes AmelF1-F4 (Xenospira1 to 4 respectively) and lacewing MalF1 genes were amplified by PCR and cloned into pET14b expression vectors (Novagen, Madison, Wis.). The resultant expression plasmids were then electroporated into E. coli BL21 (DE3) Rosetta cells and grown overnight on LB agar containing ampicillin. A single colony was then used to inoculate LB broth containing ampicillin, which was then grown at 37° C. overnight. Cells were harvested by centrifugation and lysed with detergent (Bugbuster, Novagen). Inclusion bodies were washed extensively and re-solubilised in 6M guanidinium.

This procedure yielded protein mixtures with greater than 95% purity of the honeybee proteins and greater than 50% purity of the lacewing MalF1 protein. Yields of up to 50% of the wet weight of the E. coli cell pellet were regularly obtained, indicating that the proteins are easy to express in this manner.

The solubilised honeybee recombinant proteins were applied to a Talon resin column prepared according to the manufacturer's directions. They were then eluted off the column in 100 mM Tris.HCl, 150 mM imidazole pH 8.

Example 10 Processing of Silk Proteins into Threads

The honeybee and lacewing silk proteins have been readily made into threads by a variety of methods (see FIG. 14), including the following procedures.

The anterior segment of the salivary gland from late final instar Apis mellifera was dissected under phosphate buffered saline and removed to a flat surface in a droplet of buffer. Forceps were used to grasp either end of the segment. One end was raised out of the droplet and away from the other at a steady rate. This enabled the drawing of a fine thread that rapidly solidified in air.

The honeybee and lacewing larval recombinant silk proteins formed threads or sheets after dehydration or concentration, for example by dropping soluble protein into a butanol solution or by concentrating proteins on the Talon resin column.

Threads were also obtained after honeybee or lacewing recombinant silk proteins were mixed with an organic solvent (such as hexane) to concentrate them at the interface in the correct conformation, followed by addition of a reagent to exclude them from the interface (such as butanol). The threads formed by this procedure had similar FT-IR spectra to the native silk, indicating that they were composed of the same coiled coil structure.

Silk proteins from other species described herein can also be processed by this procedure.

It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the invention as shown in the specific embodiments without departing from the spirit or scope of the invention as broadly described. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.

All publications discussed above are incorporated herein in their entirety.

Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is solely for the purpose of providing a context for the present invention. It is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present invention as it existed before the priority date of each claim of this application.

REFERENCES

- Atkins E. D. T. (1967) J. Mol. Biol. 24:139-141.
- Bendtsen J. D., Nielsen H., von Heijne G. and Brunak S. (2004) J. Mol. Biol. 340:783-795.
- Bini E., Knight D. P. and Kaplan D. L. (2004) J. Mol. Biol. 335:27-40.
- Craig C. L. and Riekel C. (2002) Comparative Biochemistry and Physiology Part B 133:493-507.
- Delorenzi M. and Speed T. (2002) Bioinformatics 18:617-625.
- Deng Y., Liu J., Zheng Q., Eliezer D., Kallenbach N. R. and Lu M. (2006) Structure 14:247-255.
- Flower N. E. and Kenchington W. R. (1967) Journal of the Royal Microscopical Society 86:297.
- Grimaldi D. and Engel M. S. (2005) Evolution of insects. Cambridge University Press, New York.
- Harayama S. (1998) Trends Biotech. 16:76-82.
- Heimburg T., Schunemann J., Weber K. and Geisler N. (1999) Biochemistry 38:12727-12734.
- Hepburn H. R., Chandler H. D. and Davidoff M. R. (1979) Insect Biochem. 9:66.
- Kneller D. G., Cohen F. E. and Langridge R. (1990) J. Mol. Biol. 214:171-182.
- LaMunyon C. W. (1988) Psyche 95:203-209.
- LaMunyon C. W. and Adams P. A. (1987) Annals of the Entomological Society of America 80:804-808.
- Lucas F., Shaw J. T. B. and Smith S. G. (1960) J. Mol. Biol. 2:339-349.
- Lucas F. and Rudall K. M. (1967) In Comprehensive Biochemistry (Ed. Florkin M and Stotz H) Vol 26B pp 475-559 Elsevier Amsterdam.
- Lupas A., Van Dyke M. and Stock J. (1991) Science 252:1162-1164.
- McClelland J. L. and Rumelhart D. E. (1988) Explorations in Parallel Distributed Processing vol 13. pp 318-362. MIT Press, Cambridge Mass.
- Needleman S. B. and Wunsch C. D. (1970) J. Mol. Biol. 48:443-453.
- Quicke D. L. J., Shaw M. R., Takahashi M. and Yanechin B. (2004) Journal of Natural History 38:2167-2181.
- Reiser K., McCormick, Rucker R. B. (1992) The FASEB Journal 6:2439-2449.
- Rost B. and Sander C. (1993) J. Molecular Biology 232:584-599.
- Rost B., Yachdav G. and Liu J. (2004) Nucleic Acids Research 32 (Web Server issue):W321-W326.
- Rudall K. M. (1962) In Comparative Biochemistry (Ed. By Florkin M and Mason HS) Vol 4, pp. 297-435. Academic Press, New York.
- Rudall K. M. and Kenchington W. (1971) Annual Reviews in Entomology 16:73-96.
- Silva-Zacarin E. C. M., Silva De Moraes R. L. M. and Taboga S. R. (2003) J. Biosci. 6:753-764.
- Speilger P. E. (1962) Annals of the Entomological Society of America 55:69-77.
- Wang M. B., Li Z. Y. et al. (1998) Acta Hort. 461:401-407.
- Yamada H., Shigesada K., Igarashi Y., Takasu Y., Tsubouchi K. and Kato Y. (2004) Int. J. Wild Silkmoth and Silk 9:61-66.

1-45. (canceled)
46. A silk polypeptide, wherein at least a portion of the polypeptide has a coiled coil structure, and wherein the polypeptide comprises an amino acid sequence which is at least 40% identical to any one or more of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:56, SEQ ID NO:57; SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:62, or SEQ ID NO:63, and wherein the silk polypeptide is fused to at least one other polypeptide; or wherein the silk polypeptide is present in a silk fiber or copolymer and crosslinked to a surface of interest.
47. The silk polypeptide of claim 46, wherein i) the portion of the polypeptide that has a coiled coil structure comprises at least 10 copies of the heptad sequence abcdefg, and wherein at least 25% of the amino acids at positions a and d are alanine residues, or ii) the portion of the polypeptide that has a coiled coil structure comprises at least 10 copies of the heptad sequence abcdefg, and at least 25% of the amino acids at positions a, d and e are alanine residues.
48. The silk polypeptide of claim 46 which is fused to at least one other polypeptide.
49. The silk polypeptide of claim 46, wherein the silk polypeptide is present in a silk fiber or copolymer and crosslinked to a surface of interest.
50. A product comprising at least one silk polypeptide of claim 46.
51. The product of claim 50, wherein the product is selected from the group consisting of: a personal care product, textiles, plastics, and biomedical products.
52. A vector comprising: a polynucleotide which encodes a silk polypeptide, wherein at least a portion of the polypeptide has a coiled coil structure, and wherein the silk polypeptide comprises an amino acid sequence which is at least 40% identical to at least any one or more of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:56, SEQ ID NO:57; SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:62, or SEQ ID NO:63; and a heterologous promoter operably linked to the polynucleotide.
53. A composition comprising the vector of claim 52, and one or more acceptable carriers.
54. A recombinant host cell comprising a polynucleotide which encodes a silk polypeptide, wherein at least a portion of the polypeptide has a coiled coil structure, and wherein the silk polypeptide comprises an amino acid sequence which is at least 40% identical to at least any one or more of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:56, SEQ ID NO:57; SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:62, or SEQ ID NO:63; and wherein a) the polynucleotide is operably linked to a heterologous promoter, and/or b) the recombinant host cell is a bacterial, yeast or plant cell.
55. The recombinant host cell of claim 54, wherein the silk polypeptide comprises an amino acid sequence which is at least 50% identical to at least any one or more of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:56, SEQ ID NO:57; SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:62, or SEQ ID NO:63.
56. The recombinant host cell of claim 54, wherein the silk polypeptide comprises an amino acid sequence which is at least 70% identical to at least any one or more of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:56, SEQ ID NO:57; SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:62, or SEQ ID NO:63.
57. The recombinant host cell of claim 54, wherein the silk polypeptide comprises an amino acid sequence which is at least 80% identical to at least any one or more of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:56, SEQ ID NO:57; SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:62, or SEQ ID NO:63.
58. The recombinant host cell of claim 54, wherein the polynucleotide comprises a nucleic acid sequence which is at least 40% identical to any one or more of SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:70, SEQ ID NO:71, or SEQ ID NO:76.
59. The recombinant host cell of claim 54, wherein the polynucleotide comprises a nucleic acid sequence which is at least 50% identical to any one or more of SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:70, SEQ ID NO:71, or SEQ ID NO:76.
60. The recombinant host cell of claim 54, wherein the polynucleotide comprises a nucleic acid sequence which is at least 70% identical to any one or more of SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:70, SEQ ID NO:71, or SEQ ID NO:76.
61. The recombinant host cell of claim 54, wherein the polynucleotide comprises a nucleic acid sequence which is at least 80% identical to any one or more of SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:70, SEQ ID NO:71, or SEQ ID NO:76.
62. The recombinant host cell of claim 54, wherein the portion of the silk polypeptide that has a coiled coil structure comprises at least 10 copies of the heptad sequence abcdefg, and wherein at least 25% of the amino acids at positions a and d are alanine residues.
63. The recombinant host cell of claim 54, wherein the portion of the silk polypeptide that has a coiled coil structure comprises at least 18 copies of the heptad sequence abcdefg, and wherein at least 25% of the amino acids at positions a and d are alanine residues.
64. The recombinant host cell of claim 54 which is a bacterial cell.
65. A process for preparing a silk polypeptide comprising cultivating the recombinant host cell of claim 54, under conditions which allow expression of the polynucleotide encoding the polypeptide, and recovering the expressed polypeptide.
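
By way of illustration only, the heptad criterion recited in claims 47, 62 and 63 (a minimum number of copies of the heptad abcdefg, with at least 25% alanine at positions a and d, or at a, d and e) can be sketched as a short calculation. The sketch below is not part of the claims and is not the applicants' method. It assumes that the coiled coil portion has already been identified, that it consists of contiguous heptads, and that the heptad position of its first residue is known. The function names, the `register` parameter and the toy sequence are hypothetical and are not taken from the sequence listing.

```python
from typing import Iterable

HEPTAD = "abcdefg"


def alanine_fraction(segment: str,
                     positions: Iterable[str] = ("a", "d"),
                     register: str = "a") -> float:
    """Fraction of alanine among residues at the given heptad positions.

    segment   -- one-letter amino acid sequence of the coiled coil portion
                 (assumed to be contiguous heptads)
    positions -- heptad positions to inspect, e.g. ("a", "d")
    register  -- heptad position of the first residue (an assumption the
                 caller must supply; the claims do not say how it is found)
    """
    wanted = set(positions)
    start = HEPTAD.index(register)
    hits = 0
    total = 0
    for i, residue in enumerate(segment.upper()):
        if HEPTAD[(start + i) % 7] in wanted:
            total += 1
            if residue == "A":
                hits += 1
    if total == 0:
        raise ValueError("no residues at the requested heptad positions")
    return hits / total


def meets_heptad_criterion(segment: str,
                           min_copies: int = 10,
                           min_fraction: float = 0.25,
                           positions: Iterable[str] = ("a", "d")) -> bool:
    """True if the segment has enough complete heptads and enough alanine."""
    copies = len(segment) // 7  # complete heptads only
    return (copies >= min_copies
            and alanine_fraction(segment, positions) >= min_fraction)


# Toy segment (hypothetical): 10 heptads with alanine at positions a and e.
toy = "AKLEALE" * 10
print(alanine_fraction(toy))                              # positions a and d
print(alanine_fraction(toy, positions=("a", "d", "e")))   # positions a, d and e
print(meets_heptad_criterion(toy))                        # 10 copies, 25%
print(meets_heptad_criterion(toy, min_copies=18))         # 18-copy threshold
```

Passing `min_copies=18` corresponds to the higher copy number of claim 63, and `positions=("a", "d", "e")` to alternative ii) of claim 47. In practice the register would come from a coiled coil prediction step that this sketch does not cover.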